I was wondering how to change the code below to read x number of lines only process the sql insert statement then continue to read the file by x number and process until end of file. I am new to the idea of file pointers but i understand it should be possible using fgets.
I'm hoping to change the below code into a function where I can pass the filename and the number of lines I want read and processed.
I currently have : (from here)
$handle = fopen(dirname(__FILE__)."/files/workorderstest.csv" , "r");
$batch++;
if ($handle) {
$counter = 0;
//instead of executing query one by one,
//let us prepare 1 SQL query that will insert all values from the batch
$sql ="INSERT INTO workorderstest(id,parentid,f1,f2,f3,f4,f5,f6,f7,f8,f9,f10) VALUES ";
while (($line = fgets($handle)) !== false) {
$sql .= "($line),";
$counter++;
}
$sql = substr($sql, 0, strlen($sql) - 1);
var_dump($sql);
if ($conn->query($sql) === TRUE) {
} else {
}
fclose($handle);
}
I want to keep the memory footprint to a minimum. I'm thinking this should just be a matter of keeping track of the pointer -> repeat until lines reached -> process sql -> start at pointer -> repeat until eof.
- Is
fgets()
the best to use for this? - Do I need to incorperate a callback or some such to defer the sql processing until all lines are read?
- I'm a bit lost on where to start as I am still learning PHP.
**** Updated Answered script below if it helps someone else ...
date_default_timezone_set('Australia/Brisbane');
$date = date('m/d/Y h:i:s a', time());
$timezone = date_default_timezone_get();
$time_start = microtime(true);
$batch_size = 500; // Lines to be read per batch
$batch = 0;
$counter = 0;
$lines = 0;
$conn = new mysqli($servername, $username, $password, $dbname);
if ($conn->connect_error) {
die("Connection failed: " . $conn->connect_error);
}
// Remove Existing Data from table
$sql = "TRUNCATE TABLE `workorderstest`";
$conn->query($sql);
$handle = fopen(dirname(__FILE__)."/files/workorders.csv" , "r");
//instead of executing query one by one,
//let us prepare 1 SQL query that will insert all values from the batch
$sql_prefix ="INSERT INTO workorderstest(id,parentid,f1,f2,f3,f4,f5,f6,f7,f8,f9,f10) VALUES ";
$values = "";
while (($line = fgets($handle)) !== false) {
$values .= "($line),";
$counter++;
$lines++;
if ($counter == $batch_size) {
$values = substr($values, 0, strlen($values) - 1);
$conn->query($sql_prefix . $values) or die($conn->error);
$counter = 0;
$values ="";
$batch++;
}
}
if ($counter > 0) { // Execute the last batch
$values = substr($values, 0, strlen($values) - 1);
$conn->query($sql_prefix . $values) or die($conn->error);
}
// Output results
$time_end = microtime(true);
$time = $time_end - $time_start;
echo "Importing Script running at: $date <br/>";
echo "Timezone: $timezone <br/>";
echo "<br/>";
echo "Script Summary:";
echo "Time running script: " . round($time,3) . " seconds <br/>";
echo "Memory: ".memory_get_usage() . " bytes <br/>";
echo "Records Updated: $lines <br/>";
echo "Batches run: $batch <br/>";
?>
Is
fgets()
the best to use for this? It's a fine way to do it. Another option is to read the entire file into an array withfile()
, then loop over the array withforeach()
.Do I need to incorperate a callback? No. Just perform the query after reading every batch of lines from the file.
Where to start? When the counter reaches the batch size, perform the query. Then set the counter back to
0
and set the query string back to the initial value. Finally, at the end of the loop you'll need to perform the query with the remaining values (unless the file size was an exact multiple of the batch size, in which case there won't be anything remaining).