I am having problems inserting information into a table in my SQLSERVER 2008 database from my python project using OPENROWSET BULK and even though the CSV file generates special characters such as Ñ, á, é, etc... When inserting it into the database these characters are corrupted.
This is where I create my CSV file:
dfClientes.to_csv(cvs_path, index=False, sep=';', encoding='utf-8-sig')
I run a shutil copy to move that file to a specific path
shutil.copy(cvs_path, ruta_destinoCSV)
and I run the OPENROWSET BULK:
bulk_insert_query = f'''INSERT INTO sgCliente (IdCliente,NomCliente,ApeCliente,NumDocumento,IdTipoDoc,NomTipoDoc,CodTipoDoc,Tip_docu,Sexo,FactorRh,GrupSangre,EstadoCivil,FecNacimiento,eMail,TelMovil,PrimerNombre,SegundoNombre,PrimerApellido,IndCentroExcelencia,IndDatosPersonal,IndHabilitado,IdTercero,CodUES)
SELECT * FROM OPENROWSET(
BULK '{ ruta_destinoCSV }',
FORMATFILE='{ ruta_destinoXML }',
CODEPAGE = 'UTF-8-SIG',
FIRSTROW = 2) as t
'''
This is the content of my XML fiel that I use as a formatfile:
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="100" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="101" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="3" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="101" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="4" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="20" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="5" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="1" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="6" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="30" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="7" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="2" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="8" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="3" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="9" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="254" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="10" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="254" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="11" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="254" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="12" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="254" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="13" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="8"/>
<FIELD ID="14" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="50" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="15" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="15" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="16" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="50" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="17" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="50" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="18" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="50" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="19" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="50" />
<FIELD ID="20" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="50" />
<FIELD ID="21" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="50" />
<FIELD ID="22" xsi:type="CharTerm" TERMINATOR=';' MAX_LENGTH="50" />
<FIELD ID="23" xsi:type="CharTerm" TERMINATOR='\r\n' MAX_LENGTH="5" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="IdCliente" xsi:type="SQLINT"/>
<COLUMN SOURCE="2" NAME="NomCliente" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="3" NAME="ApeCliente" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="4" NAME="NumDocumento" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="5" NAME="IdTipoDoc" xsi:type="SQLTINYINT"/>
<COLUMN SOURCE="6" NAME="NomTipoDoc" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="7" NAME="CodTipoDoc" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="8" NAME="Tip_docu" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="9" NAME="Sexo" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="10" NAME="FactorRh" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="11" NAME="GrupSangre" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="12" NAME="EstadoCivil" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="13" NAME="FecNacimiento" xsi:type="SQLDATETIME"/>
<COLUMN SOURCE="14" NAME="eMail" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="15" NAME="TelMovil" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="16" NAME="PrimerNombre" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="17" NAME="SegundoNombre" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="18" NAME="PrimerApellido" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="19" NAME="IndCentroExcelencia" xsi:type="SQLBIT"/>
<COLUMN SOURCE="20" NAME="IndDatosPersonal" xsi:type="SQLBIT"/>
<COLUMN SOURCE="21" NAME="IndHabilitado" xsi:type="SQLBIT"/>
<COLUMN SOURCE="22" NAME="IdTercero" xsi:type="SQLINT"/>
<COLUMN SOURCE="23" NAME="CodUES" xsi:type="SQLVARYCHAR"/>
</ROW>
</BCPFORMAT>
The solution is very simple, thank you to @THOM A.
Sql server 2008 has no support for UTF-8 so i have to encoding in ANSI, so the solution is change this:
for this: