How to avoid non-float values removed from DataTable with data from Excel file?

364 views Asked by At

I'm using the following code for get a DataTable variable with the information of a Sheet from a Excel file:

// Just a few examples about connectionString and Excel's file path:
string pathFile = @"C:\Windows\MyFolder\myExcelSample.xlsx";
string excelConnString = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + pathFile + ";Extended Properties='Excel 12.0;HDR=YES;IMEX=1;';";

using (OleDbConnection objConn = new OleDbConnection(cadenaConexion))
{
    objConn.Open();
    OleDbCommand cmd = new OleDbCommand();
    OleDbDataAdapter oleda = new OleDbDataAdapter();
    DataSet ds = new DataSet();
    DataTable dt = objConn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
    string sheetName = string.Empty;

    if (dt != null)
    {
        var tempDataTable = (from dataRow in dt.AsEnumerable()
                             where !dataRow["TABLE_NAME"].ToString().Contains("FilterDatabase")
                             select dataRow).CopyToDataTable();
        dt = tempDataTable;
        sheetName = dt.Rows[TABLE_ROW]["TABLE_NAME"].ToString();
    }

    cmd.Connection = objConn;
    cmd.CommandType = CommandType.Text;
    cmd.CommandText = "SELECT * FROM [" + sheetName + "]";
    oleda = new OleDbDataAdapter(cmd);
    oleda.Fill(ds, "Fact_TEMP");
    tbl_temporal = ds.Tables["Fact_TEMP"];
    objConn.Close();
}

The Excel file has a column called "Document No#" which this code says is float type, but, this column has values that are not float.

Here are a few examples:

444036
CO27_009734
CO31_050656
444041
444041
CO24_102377
CO64_000021
444043
CO24_102378
444044
444044
CO24_102380
CO24_102381
444046
444046444049
444050
CO24_102384

And the values that are not float-type are removed in the tbl_temporal variable.

Which other ways are for solve this situation that doesn't envolve user update the type of the column (which by default is General) in the Excel file?


A few information I have to share:

  • Excel file has a ".xlsx" extension and has 40340 rows.
  • Excel file cannot be modified. This Excel file is supplied by a user which just upload his/her Excel file to our System and our System has to solve the issues and it shouldn't modify the schema or its data.
2

There are 2 answers

0
Mauricio Arias Olave On BEST ANSWER

After following this comment of the accepted answer:

  • In your connection string, change the value of HDR=YES to HDR=NO.

I have change the way I get the Excel info in the DataTable variable for use the first row of the DataTable as the name of the columns in the Excel file.

This is the code I use for it:

// Add columns to "tbl_result" DataTable.
for (int colCount = 0; colCount < tbl_excel.Columns.Count; colCount++)
{
    tbl_result.Columns.Add(new DataColumn()
    {
        DataType = tbl_excel.Columns[colCount].DataType,
        ColumnName = tbl_excel.Rows[0][colCount].ToString(),
        AllowDBNull = true
    });
}

// Remove row "which is actually the header  in the Excel file".
tbl_excel.Rows.RemoveAt(0);

// Set the name of the table.
tbl_result.TableName = tbl_excel.TableName;

// Import rows.
foreach (DataRow row in tbl_excel.Rows)
{
    tbl_result.Rows.Add(row.ItemArray);
}

I have the check a few times the Excel file because I was getting this error:

The given value of type String from the data source cannot be converted to type float of the specified target column.

I created the table in the SQL Server Database using the "Import Data" feature using the Excel file, but, what I didn't know is that some columns in the Excel file has values that doesn't correspond with the data type of the columns migrated in the SQL Server table.

So, I changed these columns (which are the problematic ones):

-- [Document No#] was float before execute this line.
ALTER TABLE Fact_TEMP ALTER COLUMN [Document No#] NVARCHAR(255)

-- [G/L Account No#] was float before execute this line.
ALTER TABLE Fact_TEMP ALTER COLUMN [G/L Account No#] NVARCHAR(255)

And after try again uploading the Excel file (which has 40340 rows), the upload worked without any problem.


The TL;DR version is:

  • Change the value of HDR=YES to HDR=NO in your connection string.
  • Check the values of the Excel file for check whether contains invalid data (i. e. a float column which has NULL or other DataType values).
  • Check that the DataType of the SQL Server Database table has the same DataType expected.
3
Trisped On

Try putting an apostrophe ' in front of the numeric values or putting one of the string values as the first line (after the column headers).

You could also switch from OLEDB to one of the XML Excel file readers from NuGet.