I have the mount point below created:
dbutils.fs.ls("/mnt/mount/raw2")
and the following file available in that mount point:
customer1.csv
+---+------+------+
| ID|  Name| Place|
+---+------+------+
|101|  Hari|   Tcr|
|102|  John|   Bgr|
+---+------+------+
Now I am reading the file into a DataFrame:
df = spark.read.option("header", "true").csv("/mnt/mount/raw2")
This reads the customer1.csv file.
Next I am writing it into a Delta table:
df.write.format("delta").mode("overwrite").save("/mnt/mount/raw2/customer_data")
Then I receive new data in the mount point: customer2.csv, with
+---+------+------+
| ID|  Name| Place|
+---+------+------+
|103|Stefen|   Hyd|
|104| Devid|   Bgr|
|105| Wager|London|
+---+------+------+
Now I want to append this data into the same Delta location (customer_data).
What is the best way to dynamically read only the newly arrived file from the same mount point?
I am looking for a scenario like the one below:
existing_delta_table =
+---+------+------+
| ID|  Name| Place|
+---+------+------+
|101|  Hari|   Tcr|
|102|  John|   Bgr|
+---+------+------+
A new file arrives:
df = spark.read.option("header", "true").csv("/mnt/mount/raw2")
# pseudocode for the desired append-only-new-records behavior
if df is new_file:
    existing_delta_table.append(df_records)
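In plain Python terms, the desired behavior is to remember which files were already ingested and append only the rows from new files. A minimal, framework-free sketch of that idea (the file names and rows mirror the question; the helper function is hypothetical):

```python
import csv
import io

ingested = set()   # files already written to the Delta location
table_rows = []    # stands in for the existing Delta table

def append_new_file(name, csv_text):
    """Append rows from `name` only if this file has not been seen before."""
    if name in ingested:
        return 0                                   # duplicate file: skip it
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    table_rows.extend(rows)                        # append only the new records
    ingested.add(name)
    return len(rows)

append_new_file("customer1.csv", "ID,Name,Place\n101,Hari,Tcr\n102,John,Bgr\n")
append_new_file("customer2.csv", "ID,Name,Place\n103,Stefen,Hyd\n104,Devid,Bgr\n105,Wager,London\n")
append_new_file("customer1.csv", "ID,Name,Place\n101,Hari,Tcr\n")  # already seen: skipped
```

After these calls, `table_rows` holds all five distinct records and the repeated customer1.csv contributes nothing. Auto Loader (below in the answer) gives you exactly this file-tracking behavior, backed by a checkpoint instead of an in-memory set.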
You can use the Auto Loader concept. Follow the steps below. I started with customer1.csv in the source folder and ran an Auto Loader stream that reads the CSV files and writes them to a Delta table. Note that you need to give the Delta table a path that is different from the source path (otherwise the stream would try to ingest the Delta files it just wrote).
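A sketch of such an Auto Loader job, assuming a Databricks runtime. The source and target paths are taken from the question; the checkpoint/schema location is an assumed name. It is wrapped in a function so the paths are explicit; in a notebook you would call `start_customer_ingest(spark)`.

```python
def start_customer_ingest(
    spark,
    source_path="/mnt/mount/raw2",                      # CSV landing folder (from the question)
    target_path="/mnt/mount/customer_data",             # Delta table: must differ from source
    checkpoint_path="/mnt/mount/_checkpoint/customer",  # assumed checkpoint/schema location
):
    """Start an Auto Loader stream that appends new CSV files to a Delta table."""
    df = (
        spark.readStream.format("cloudFiles")           # Auto Loader source
        .option("cloudFiles.format", "csv")
        .option("cloudFiles.schemaLocation", checkpoint_path)
        .option("header", "true")
        .load(source_path)
    )
    return (
        df.writeStream.format("delta")
        .option("checkpointLocation", checkpoint_path)  # tracks which files were processed
        .trigger(availableNow=True)                     # process pending files, then stop
        .outputMode("append")
        .start(target_path)
    )
```

The checkpoint is what makes re-running safe: Auto Loader records every file it has ingested there, so files that were already processed are never appended twice.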
At this point the Delta table contains the records from customer1.csv. Next, add customer2.csv to the mount point and run the same Auto Loader code again; it checks for new files and appends only the new records, so the table ends up with all five rows.
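To confirm the result, you can read the Delta output back. A minimal sketch, assuming the same target path as above:

```python
def read_customer_table(spark, target_path="/mnt/mount/customer_data"):
    """Return the current contents of the appended Delta table."""
    return spark.read.format("delta").load(target_path)

# In a notebook: read_customer_table(spark).show()
```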