Using ExcelDataReader to read Excel data starting from a particular cell

206.1k views Asked by At

I am using ExcelDataReader to read data from my Excel workbook in C#.
But structure of my Excel sheet is such that data to be read can start from any particular cell and not necessarily A1.

Can any one Please suggest a way on how this can be achieved using ExcelDataReader?

7

There are 7 answers

10
shA.t On

If you are using ExcelDataReader 3+ you will find that there isn't any method for AsDataSet() for your reader object, You need to also install another package for ExcelDataReader.DataSet, then you can use the AsDataSet() method.
Also there is not a property for IsFirstRowAsColumnNames instead you need to set it inside of ExcelDataSetConfiguration.

Example:

using (var stream = File.Open(originalFileName, FileMode.Open, FileAccess.Read))
{
    IExcelDataReader reader;

    // Create Reader - old until 3.4+
    ////var file = new FileInfo(originalFileName);
    ////if (file.Extension.Equals(".xls"))
    ////    reader = ExcelDataReader.ExcelReaderFactory.CreateBinaryReader(stream);
    ////else if (file.Extension.Equals(".xlsx"))
    ////    reader = ExcelDataReader.ExcelReaderFactory.CreateOpenXmlReader(stream);
    ////else
    ////    throw new Exception("Invalid FileName");
    // Or in 3.4+ you can only call this:
    reader = ExcelDataReader.ExcelReaderFactory.CreateReader(stream)

    //// reader.IsFirstRowAsColumnNames
    var conf = new ExcelDataSetConfiguration
    {
        ConfigureDataTable = _ => new ExcelDataTableConfiguration
        {
            UseHeaderRow = true 
        }
    };

    var dataSet = reader.AsDataSet(conf);

    // Now you can get data from each sheet by its index or its "name"
    var dataTable = dataSet.Tables[0];

    //...
}

You can find row number and column number of a cell reference like this:

var cellStr = "AB2"; // var cellStr = "A1";
var match = Regex.Match(cellStr, @"(?<col>[A-Z]+)(?<row>\d+)");
var colStr = match.Groups["col"].ToString();
var col = colStr.Select((t, i) => (colStr[i] - 64) * Math.Pow(26, colStr.Length - i - 1)).Sum();
var row = int.Parse(match.Groups["row"].ToString());

Now you can use some loops to read data from that cell like this:

for (var i = row; i < dataTable.Rows.Count; i++)
{
    for (var j = col; j < dataTable.Columns.Count; j++)
    {
        var data = dataTable.Rows[i][j];
    }
}

Update:

You can filter rows and columns of your Excel sheet at read time with this config:

var i = 0;
var conf = new ExcelDataSetConfiguration
{
    UseColumnDataType = true,
    ConfigureDataTable = _ => new ExcelDataTableConfiguration
    {
        FilterRow = rowReader => fromRow <= ++i - 1,
        FilterColumn = (rowReader, colIndex) => fromCol <= colIndex,
        UseHeaderRow = true
    }
};
1
sathish venu On
public static DataTable ConvertExcelToDataTable(string filePath, bool isXlsx = false)
{
    System.Text.Encoding.RegisterProvider(System.Text.CodePagesEncodingProvider.Instance);
    //open file and returns as Stream
        using (var stream = File.Open(filePath, FileMode.Open, FileAccess.Read))
        {
                using (var reader = ExcelReaderFactory.CreateReader(stream))
                {

                    var conf = new ExcelDataSetConfiguration
                    {
                        ConfigureDataTable = _ => new ExcelDataTableConfiguration
                        {
                            UseHeaderRow = true
                        }
                    };

                    var dataSet = reader.AsDataSet(conf);

                    // Now you can get data from each sheet by its index or its "name"
                    var dataTable = dataSet.Tables[0];

                    Console.WriteLine("Total no of rows  " + dataTable.Rows.Count);
                    Console.WriteLine("Total no of Columns  " + dataTable.Columns.Count);

                    return dataTable;

                }

        }
   
}
0
Vikas On

For ExcelDataReader v3.6.0 and above. I struggled a bit to iterate over the Rows. So here's a little more to the above code. Hope it helps for few atleast.

using (var stream = System.IO.File.Open(copyPath, FileMode.Open, FileAccess.Read))
                    {

                        IExcelDataReader excelDataReader = ExcelDataReader.ExcelReaderFactory.CreateReader(stream);

                        var conf = new ExcelDataSetConfiguration()
                        {
                            ConfigureDataTable = a => new ExcelDataTableConfiguration
                            {
                                UseHeaderRow = true
                            }
                        };

                        DataSet dataSet = excelDataReader.AsDataSet(conf);
                        //DataTable dataTable = dataSet.Tables["Sheet1"];
                        DataRowCollection row = dataSet.Tables["Sheet1"].Rows;
                        //DataColumnCollection col = dataSet.Tables["Sheet1"].Columns;

                        List<object> rowDataList = null;
                        List<object> allRowsList = new List<object>();
                        foreach (DataRow item in row)
                        {
                            rowDataList = item.ItemArray.ToList(); //list of each rows
                            allRowsList.Add(rowDataList); //adding the above list of each row to another list
                        }

                    }
1
Sievajet On

One way to do it :

FileStream stream = File.Open(@"c:\working\test.xls", FileMode.Open, FileAccess.Read);

IExcelDataReader excelReader = ExcelReaderFactory.CreateBinaryReader(stream);

excelReader.IsFirstRowAsColumnNames = true;

DataSet result = excelReader.AsDataSet();

The result.Tables contains the sheets and the result.tables[0].Rows contains the cell rows.

0
MarlinG On

I found this useful to read from a specific column and row:

FileStream stream = File.Open(@"C:\Users\Desktop\ExcelDataReader.xlsx", FileMode.Open, FileAccess.Read);
IExcelDataReader excelReader = ExcelReaderFactory.CreateOpenXmlReader(stream);
DataSet result = excelReader.AsDataSet();
excelReader.IsFirstRowAsColumnNames = true;         
DataTable dt = result.Tables[0];
string text = dt.Rows[1][0].ToString();
0
VnDevil On

Very easy with ExcelReaderFactory 3.1 and up:

using (var openFileDialog1 = new OpenFileDialog { Filter = "Excel Workbook|*.xls;*.xlsx;*.xlsm", ValidateNames = true })
{
    if (openFileDialog1.ShowDialog() == DialogResult.OK)
    {
        var fs = File.Open(openFileDialog1.FileName, FileMode.Open, FileAccess.Read);
        var reader = ExcelReaderFactory.CreateBinaryReader(fs);
        var dataSet = reader.AsDataSet(new ExcelDataSetConfiguration
        {
            ConfigureDataTable = _ => new ExcelDataTableConfiguration
            {
                UseHeaderRow = true // Use first row is ColumnName here :D
            }
        });
        if (dataSet.Tables.Count > 0)
        {
            var dtData = dataSet.Tables[0];
            // Do Something
        }
    }
}
2
cesAR On

To be more clear, I will begin at the beginning.

I will rely on the sample code found in https://github.com/ExcelDataReader/ExcelDataReader, but with some modifications to avoid inconveniences.

The following code detects the file format, either xls or xlsx.

FileStream stream = File.Open(filePath, FileMode.Open, FileAccess.Read);
IExcelDataReader excelReader;

//1. Reading Excel file
if (Path.GetExtension(filePath).ToUpper() == ".XLS")
{
    //1.1 Reading from a binary Excel file ('97-2003 format; *.xls)
    excelReader = ExcelReaderFactory.CreateBinaryReader(stream);
}
else
{
    //1.2 Reading from a OpenXml Excel file (2007 format; *.xlsx)
    excelReader = ExcelReaderFactory.CreateOpenXmlReader(stream);
}

//2. DataSet - The result of each spreadsheet will be created in the result.Tables
DataSet result = excelReader.AsDataSet();

//3. DataSet - Create column names from first row
excelReader.IsFirstRowAsColumnNames = false;

Now we can access the file contents in a more convenient way. I use DataTable for this. The following is an example to access a specific cell, and print its value in the console:

DataTable dt = result.Tables[0];
Console.WriteLine(dt.Rows[rowPosition][columnPosition]);

If you do not want to do a DataTable, you can do the same as follows:

Console.WriteLine(result.Tables[0].Rows[rowPosition][columnPosition]);

It is important not try to read beyond the limits of the table, for this you can see the number of rows and columns as follows:

Console.WriteLine(result.Tables[0].Rows.Count);
Console.WriteLine(result.Tables[0].Columns.Count);

Finally, when you're done, you should close the reader and free resources:

//5. Free resources (IExcelDataReader is IDisposable)
excelReader.Close();

I hope you find it useful.

(I understand that the question is old, but I make this contribution to enhance the knowledge base, because there is little material about particular implementations of this library).