Delta Live Tables problem - The column name(s) 'Alpha Source', 'Alpha source' are duplicated in dataset

I am trying to load CSV files from a data lake into Delta tables, but I get a duplicate column name error when the pipeline creates the tables.

My CSV looks something like this -

Id  Alpha Source    Alpha source
1   AKH              null
2   AKG 

I am trying to load the table from abfss -

import dlt

@dlt.table(
    comment="load csv files in bronze",
    name="dev.bronze.logs",
    table_properties={
        "delta.columnMapping.mode": "name"
    },
)
def table():
    # Auto Loader (cloudFiles) streams new CSV files from the landing zone
    landing_zone_path = "abfss://[email protected]/log/"
    df = (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "csv")
        .option("header", "true")
        .option("inferSchema", "True")
        .load(landing_zone_path)
    )
    return df

Instead, it fails with the duplicate column error quoted in the title.

I would expect the additional column to go into _rescued_data, but that's not happening. I also tried spark.conf.set('spark.sql.caseSensitive', True), but that didn't work either.
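
To show what I mean by "duplicated": the two header names differ only in the case of one letter, and Delta treats column names as case-insensitive, which is presumably why they collide. Below is a minimal sketch in plain Python (no Spark involved) that finds such collisions in a header and proposes unique names. The helper names find_case_insensitive_duplicates and dedupe_columns are hypothetical, not part of DLT or Auto Loader, and how the renamed header would be applied (rewriting the files before ingestion, or supplying an explicit schema) is an assumption that is not shown here.

```python
from collections import Counter


def find_case_insensitive_duplicates(columns):
    """Return groups of column names that are equal once case is ignored."""
    groups = {}
    for name in columns:
        groups.setdefault(name.strip().lower(), []).append(name)
    return [names for names in groups.values() if len(names) > 1]


def dedupe_columns(columns):
    """Keep the first occurrence; append _2, _3, ... to later collisions."""
    seen = Counter()
    result = []
    for name in columns:
        key = name.strip().lower()
        seen[key] += 1
        result.append(name if seen[key] == 1 else f"{name}_{seen[key]}")
    return result


# Header taken from the sample CSV in the question
header = ["Id", "Alpha Source", "Alpha source"]
print(find_case_insensitive_duplicates(header))
print(dedupe_columns(header))
```

This only handles the naming side. It does not check whether a generated name such as "Alpha source_2" already exists in the header, so a real file with many columns might need an extra pass.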
