I am having some troubles with my code. I am trying to insert about 100.000 items in a Mysql table using Tasks to speed up the process. Here is my code:
List < Task > tasks = new List < Task > ();
int counter = 0;
foreach(var folder in dirs) {
object lockTarget = new object();
tasks.Add(Task.Factory.StartNew(() => {
try {
comm.CommandText = "INSERT INTO webdata(url,title) VALUES(?sUrl,?sTitle,) ON DUPLICATE KEY UPDATE url=?sUrl,title=?sTitle";
comm.Parameters.Add("?sUrl", MySql.Data.MySqlClient.MySqlDbType.VarChar).Value = url;
comm.Parameters.Add("?sTitle", MySql.Data.MySqlClient.MySqlDbType.VarChar).Value = title;
var mysql_return = comm.ExecuteNonQueryAsync();
lock(lockTarget) {
counter++;
Console.WriteLine("\rProcessing {0} of {1} - {2} {3}", counter, dirs.Length, folder, mysql_return.Status);
}
} catch (MySql.Data.MySqlClient.MySqlException ex) {
Console.WriteLine(ex.Message);
}
}));
}
Task.Factory.StartNew(() => {
Task.WaitAll(tasks.ToArray());
Console.WriteLine("Finished");
conn.Close();
});
So, when i run my code, some of the Tasks (mysql_return) are returning rantocompletion
and some Faulted
, thus only 1/2 of my entries actually inserted to the DB.
I could use ExecuteNonQuery
and no Task
but it will take ages to insert so many data.
Is the problem due to hardware(CPU can't keep up with thousands of spawned tasks) or due to my code?
Any ideas? Thanks in advance
Problem1:
comm
is modified in multiple threads it seems, that has to be a local variable.Problem2: You're not waiting for the task returned by
ExecuteNonQueryAsync
and thusTask
returned byStartNew
will not wait for the completion of theExecuteNonQueryAsync
to be finished.Eventually you close the connection assuming that all are done but they aren't. They all are started, but not completed.
You need to use await the result of
ExecuteNonQueryAsync
(for that useasync
lambda) and also callTask.UnWrap
or useTask.Run
which gives youUnWrap
for free.So your code will become something like the following: