I use the OpenXML SDK 2.0 to generate an Excel file with a large amount of data, approx. 1,000,000 rows, and I need to optimize memory usage because my machine slows down very quickly.
I want to solve this by flushing part of the generated DOM tree to the file at runtime, so I do my own buffering of the data. E.g. I have 100,000 records to write, and I flush the stream to the file after every 1,000 rows added to the Excel worksheet. I do this with the method worksheetPart.Worksheet.Save(). The documentation says of Save(): "saves the data in the DOM tree back to the part. It could be called multiple times as well. Each time it is called, the stream will be flushed."
foreach (Record m in dataList)
{
    Row contentRow = CreateContentRow(index, m); // my own method to create row content
    // Append the new row to the sheet data.
    sheetData.AppendChild(contentRow);
    if (index % BufferSize == 0)
    {
        worksheetPart.Worksheet.Save();
    }
    index++;
}
This partly works: the memory-usage chart has a sawtooth shape, but unfortunately memory usage still grows over time.
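One likely explanation is that Save() serializes the DOM to the part but the appended Row elements stay attached to sheetData, so the tree keeps growing. A commonly suggested alternative in the OpenXML SDK is the SAX-style OpenXmlWriter, which writes each element straight to the part without building a DOM. The sketch below assumes a spreadsheet variable holding an open SpreadsheetDocument and reuses the dataList / CreateContentRow names from my code above; I have not verified it against this exact workload:

    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Spreadsheet;

    WorksheetPart worksheetPart = spreadsheet.WorkbookPart.AddNewPart<WorksheetPart>();

    using (OpenXmlWriter writer = OpenXmlWriter.Create(worksheetPart))
    {
        writer.WriteStartElement(new Worksheet());
        writer.WriteStartElement(new SheetData());

        int index = 1;
        foreach (Record m in dataList)
        {
            // WriteElement serializes the row immediately;
            // nothing accumulates in an in-memory tree.
            writer.WriteElement(CreateContentRow(index, m));
            index++;
        }

        writer.WriteEndElement(); // SheetData
        writer.WriteEndElement(); // Worksheet
    }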
Does anyone have any idea how to solve this issue?
SpreadsheetGear for .NET can create an xlsx workbook with 1,000,000 rows by 40 columns of random numbers (that's 40 million cells) in 74 seconds (that includes creating the workbook in memory from random numbers and saving to disk on an overclocked Intel QX 6850 and Windows Vista 32).
What kind of performance are you seeing with the Open XML SDK?
You can download a free trial of SpreadsheetGear here and try it yourself.
I will paste the code to generate the 40 million cell workbook below.
Disclaimer: I own SpreadsheetGear LLC