ACE.OLEDB provider incorrectly reads some column names

2.5k views Asked by At

I'm reading an excel file from a C# app using ACE.OLEDB. Everything has worked fine so far until today I noticed that one of the column names is read incorrectly.

Here's what I've got in my excel file

Here's what I've got in my excel file

Here's what I've got in my debugger

And here's what is read in the code

Basically the dot (".") is replaced with a hash ("#") for some reason.

The code is straightforward and works fine for the most part, don't think the problem is there, but will show it here just for the sake of clarity.

DataTable data = new DataTable();    
string strAccessConn = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + path.FullName + ";Extended Properties=\"Excel 12.0\";\"HDR=YES\";\"IMEX=1;\"";     
OleDbConnection myAccessConn = new OleDbConnection(strAccessConn);
string strAccessSelect = "select  * from [" + SheetName + "];";
OleDbCommand myAccessCommand = new OleDbCommand(strAccessSelect, myAccessConn);
OleDbDataAdapter myDataAdapter = new OleDbDataAdapter(myAccessCommand);                           
myDataAdapter.Fill(0, maxRows, data);  

Is there something I'm doing wrong or is it a bug in OLEDB?

2

There are 2 answers

0
mjsqu On

I came here looking for a reference of column name transformations for special characters taken from header rows in Excel. Using SQL like this:

select * from
OPENROWSET(
'Microsoft.ACE.OLEDB.12.0','Excel 12.0;
Database=<INSERT-FILENAME>;
Extended Properties=Excel 12.0;IMEX=1;'
,'select * From [Sheet1$]')

Here are the transformations I've seen:

Excel | SQL
-----------------------
[     | (
]     | )
.     | #
!     | _
`     | _

This isn't a comprehensive list, but it may help someone. I've left the answer as a community wiki so that it can be freely edited.

0
Mehrad On

This issue only happens in the Header row and it's due to the fact that it prevents dots (.) from showing in the Excel file's header. Dot's in the header for Excel file (when translated to XML) can cause some issues. Despite the fact that there is no significance for . in XML, how Excel serialize back and forth must have extra conventions (not 100% on this). So while entering data in you can simply put # and it get's translated to . in the header in excel and when you're outputting a header with a . it will show up as # instead.

To go around this issue you can simply turn your headers of by changing your connection string header syntax to HDR=No. When data get in not as headers but normal rows then can be easily handled as float numbers and . won't cause any issues.