SQLite C# extremely slow on update


I'm really struggling to iron out this issue. When I use the following code to update my database for large numbers of records, it runs extremely slow. I've got 500,000 records to update, which takes nearly an hour. During this operation, the journal file grows slowly with little change to the main SQLite db3 file - is this normal?

The operation only seems to be a problem when I have large numbers of records to update - it runs virtually instantly on smaller numbers of records.

Some other operations are performed on the database prior to this code running, so could they be somehow tying up the database? I've tried to ensure that all other connections are closed properly.

Thanks for any suggestions.

using (SQLiteConnection sqLiteConnection = new SQLiteConnection("Data Source=" + _case.DatabasePath))
{
    sqLiteConnection.Open();
    using (SQLiteCommand sqLiteCommand = new SQLiteCommand("begin", sqLiteConnection))
    {
        sqLiteCommand.ExecuteNonQuery();
        sqLiteCommand.CommandText = "UPDATE CaseFiles SET areaPk = @areaPk, KnownareaPk = @knownareaPk WHERE mhash = @mhash";
        var pcatpk = sqLiteCommand.CreateParameter();
        var pknowncatpk = sqLiteCommand.CreateParameter();
        var pmhash = sqLiteCommand.CreateParameter();
        pcatpk.ParameterName = "@areaPk";
        pknowncatpk.ParameterName = "@knownareaPk";
        pmhash.ParameterName = "@mhash";
        sqLiteCommand.Parameters.Add(pcatpk);
        sqLiteCommand.Parameters.Add(pknowncatpk);
        sqLiteCommand.Parameters.Add(pmhash);
        foreach (CatItem CatItem in _knownFiless)
        {

            if (CatItem.FromMasterHashes == true)
            {
                pcatpk.Value = CatItem.areaPk;
                pknowncatpk.Value = CatItem.areaPk;
                pmhash.Value = CatItem.mhash; 
            }
            else
            {
                pcatpk.Value = CatItem.areaPk;
                pknowncatpk.Value = null;
                pmhash.Value = CatItem.mhash; 
            }
            sqLiteCommand.ExecuteNonQuery();
        }
        sqLiteCommand.CommandText = "end";
        sqLiteCommand.ExecuteNonQuery();
        sqLiteCommand.Dispose();
        sqLiteConnection.Close();
    }
    sqLiteConnection.Close();
}
There are 4 answers

MaLio (BEST ANSWER)

The first thing is to ensure that you have an index on mhash. Then group commands into batches, and consider using more than one thread.

Alternatively:

Bulk import the records into a temporary table, create an index on the mhash column, then perform a single UPDATE statement to update the records.
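A minimal sketch of that temp-table approach, using Python's built-in sqlite3 module for illustration (the table and column names come from the question; the sample data is made up):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE CaseFiles (mhash TEXT, areaPk INTEGER, KnownareaPk INTEGER)")
con.executemany("INSERT INTO CaseFiles (mhash) VALUES (?)",
                [("h1",), ("h2",), ("h3",)])

# 1. Bulk import the new values into a temporary table.
con.execute("CREATE TEMP TABLE updates (mhash TEXT, areaPk INTEGER, KnownareaPk INTEGER)")
con.executemany("INSERT INTO updates VALUES (?, ?, ?)",
                [("h1", 10, 10), ("h3", 30, None)])

# 2. Index the join column so each lookup is a b-tree search, not a table scan.
con.execute("CREATE INDEX idx_updates_mhash ON updates (mhash)")

# 3. A single UPDATE with correlated subqueries applies everything at once.
con.execute("""
    UPDATE CaseFiles SET
        areaPk      = (SELECT u.areaPk      FROM updates u WHERE u.mhash = CaseFiles.mhash),
        KnownareaPk = (SELECT u.KnownareaPk FROM updates u WHERE u.mhash = CaseFiles.mhash)
    WHERE mhash IN (SELECT mhash FROM updates)
""")
con.commit()

rows = con.execute("SELECT mhash, areaPk FROM CaseFiles ORDER BY mhash").fetchall()
print(rows)
```

The correlated-subquery form works on any SQLite version; the one statement replaces the 500,000 individual UPDATEs.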

zeFrenchy

You need to wrap everything inside a transaction, otherwise I believe SQLite will create and commit one for you for every update - hence the slowness. You clearly know that, looking at your code, but I am not sure issuing "begin" and "end" commands achieves the same result here; you might end up with empty transactions at the start and finish instead of one wrapping everything. Try something like this instead, just in case:


  using (SQLiteTransaction mytransaction = myconnection.BeginTransaction())
  {
    using (SQLiteCommand mycommand = new SQLiteCommand(myconnection))
    {
      SQLiteParameter myparam = new SQLiteParameter();

      mycommand.CommandText = "YOUR QUERY HERE";
      mycommand.Parameters.Add(myparam);

      foreach (CatItem CatItem in _knownFiless)
      {
        ...
        mycommand.ExecuteNonQuery();
      }
    }
    mytransaction.Commit();
  } 
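The same pattern sketched in Python's sqlite3, with explicit transaction control mirroring the BeginTransaction call above (names are illustrative; with a file-based database the single COMMIT means one journal sync instead of one per row):

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so BEGIN/COMMIT
# below are the only transaction boundaries - just like the explicit
# transaction in the C# snippet.
con = sqlite3.connect(":memory:", isolation_level=None)
con.execute("CREATE TABLE CaseFiles (mhash TEXT PRIMARY KEY, areaPk INTEGER)")
con.executemany("INSERT INTO CaseFiles VALUES (?, NULL)",
                [(f"h{i}",) for i in range(1000)])

updates = [(i, f"h{i}") for i in range(1000)]

con.execute("BEGIN")                 # one transaction for the whole loop...
for area_pk, mhash in updates:
    con.execute("UPDATE CaseFiles SET areaPk = ? WHERE mhash = ?",
                (area_pk, mhash))
con.execute("COMMIT")                # ...so the journal is flushed once, not 1000 times

result = con.execute("SELECT areaPk FROM CaseFiles WHERE mhash = 'h42'").fetchone()
print(result)
```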
Arion

This part is most certainly your problem.

foreach (CatItem CatItem in _knownFiless)
{
....
     sqLiteCommand.ExecuteNonQuery();
}

You are looping over a List(?) and executing a query against the database on every iteration. That is not a good way to do it, because database calls are quite expensive, so you might consider another way of updating these items.
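One way to avoid issuing a separate round trip per list item is to hand the driver the whole parameter list in one call - sketched here with Python sqlite3's executemany (the System.Data.SQLite analogue is reusing one prepared command inside a single transaction; names and data are illustrative):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE CaseFiles (mhash TEXT PRIMARY KEY, areaPk INTEGER)")
con.executemany("INSERT INTO CaseFiles VALUES (?, NULL)",
                [("a",), ("b",), ("c",)])

# (areaPk, mhash) pairs, playing the role of _knownFiless in the question.
known_files = [(1, "a"), (2, "c")]

# One call binds and executes the prepared statement for every pair.
con.executemany("UPDATE CaseFiles SET areaPk = ? WHERE mhash = ?", known_files)
con.commit()

rows = con.execute("SELECT mhash, areaPk FROM CaseFiles ORDER BY mhash").fetchall()
print(rows)
```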

mafu

The SQL code appears to be okay. The C# code is not wrong, but it has some redundancy (the explicit Close/Dispose calls are not needed, since you are already wrapping everything in using blocks).

There is a foreach loop over _knownFiless (is the double "s" intended?) - could that be what runs slowly? It is unusual to run a query against the DB inside a loop; rather, you should create a single query with the respective set of parameters. Consider that (especially without an index on the hash column) you will perform n * m operations (n being the iteration count of the loop, m being the table size).

Considering that m is around 500k, and assuming that m = n, you get 250,000,000,000 operations. That may well take an hour.

Former connections or operations should have no effect as far as I know.

You should also ensure that the internal structure of the database is not causing problems. Is there a compound index that is affected by this operation? Any foreign keys or complex constraints?
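A quick way to check whether the UPDATE's WHERE clause triggers that n * m full-table-scan behaviour is to ask SQLite for its query plan - a sketch using Python's sqlite3 (table names from the question; exact plan wording varies by SQLite version):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE CaseFiles (mhash TEXT, areaPk INTEGER)")

# Without an index on mhash, the plan reports a full scan of CaseFiles.
plan_before = con.execute(
    "EXPLAIN QUERY PLAN UPDATE CaseFiles SET areaPk = 1 WHERE mhash = 'x'"
).fetchall()[0][-1]
print(plan_before)

# With the index in place, the same UPDATE becomes an indexed search.
con.execute("CREATE INDEX idx_mhash ON CaseFiles (mhash)")
plan_after = con.execute(
    "EXPLAIN QUERY PLAN UPDATE CaseFiles SET areaPk = 1 WHERE mhash = 'x'"
).fetchall()[0][-1]
print(plan_after)
```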