Reading a JSON file with open() not working on Databricks


I am writing a function that loads JSON files into MongoDB Atlas, but I cannot open the JSON file from dbfs/FileStore. From what I have read, this seems to be a Databricks Community Edition issue, and none of the examples I found worked. Is there an alternative way to open a JSON file that I uploaded to dbfs/FileStore?

The code I have is:

import json
import os

import pymongo


def set_mongo_collection(user_id, pwd, cluster_name, db_name, src_file_path, json_files):
    '''Create one MongoDB collection per JSON file and return the last insert result.'''
    # Create a client connection to MongoDB.
    #mongo_url = f"mongodb+srv://{user_id}:{pwd}@{cluster_name}.zibbf.mongodb.net/{db_name}?retryWrites=true&w=majority"
    mongo_url = f"mongodb://{user_id}:{pwd}@sandbox-shard-00-00.4xoxr.mongodb.net:27017,sandbox-shard-00-01.4xoxr.mongodb.net:27017,sandbox-shard-00-02.4xoxr.mongodb.net:27017/{db_name}?ssl=true&replicaSet=atlas-72rhuk-shard-0&authSource=admin&retryWrites=true&w=majority"
    client = pymongo.MongoClient(mongo_url)
    db = client[db_name]
    result = None  # stays None if json_files is empty

    # Read in each JSON file and use it to create a new collection.
    for file in json_files:
        db.drop_collection(file)
        json_file = os.path.join(src_file_path, json_files[file])
        with open(json_file, 'r') as openfile:
            json_object = json.load(openfile)
            file = db[file]
            result = file.insert_many(json_object)

    client.close()

    return result

Everything runs fine until execution reaches with open(json_file, 'r') as openfile:, which raises an error.

Error message:

FileNotFoundError: [Errno 2] No such file or directory: '/FileStore/DS3002_Final/company.json'
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<command-3071759576543725> in <module>
      3 json_files = {"company" : "company.json"}
      4 
----> 5 set_mongo_collection(atlas_user_name, atlas_password, atlas_cluster_name, src_dbname, src_dir, json_files)

<command-3071759576543717> in set_mongo_collection(user_id, pwd, cluster_name, db_name, src_file_path, json_files)
     19         db.drop_collection(file)
     20         json_file = os.path.join(src_file_path, json_files[file])
---> 21         with open(json_file, 'r') as openfile:
     22             json_object = json.load(openfile)
     23             file = db[file]

FileNotFoundError: [Errno 2] No such file or directory: '/FileStore/DS3002_Final/company.json'
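
For context on the path in this error: Python's built-in open() resolves '/FileStore/...' against the driver's local disk, not against DBFS. On workspaces where the DBFS FUSE mount is available, the same file is normally visible under a /dbfs prefix. Below is a minimal sketch of that path translation. The helper name to_fuse_path is hypothetical, and the sketch assumes the /dbfs mount exists, which may not hold on Community Edition (that would be consistent with the behaviour described here).

```python
def to_fuse_path(dbfs_path: str) -> str:
    """Map a DBFS path to the driver-local /dbfs mount path (assumed to exist)."""
    # Drop an explicit "dbfs:" scheme if the caller supplied one.
    if dbfs_path.startswith("dbfs:"):
        dbfs_path = dbfs_path[len("dbfs:"):]
    # Leave paths that already point at the mount untouched.
    if dbfs_path.startswith("/dbfs/"):
        return dbfs_path
    return "/dbfs/" + dbfs_path.lstrip("/")
```

With this, src_dir could be passed as to_fuse_path('/FileStore/DS3002_Final') before calling os.path.join, provided the mount is actually present on the cluster.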

I tried using dbutils.fs.cp() and dbutils.fs.put() before running with open(json_file, 'r') as openfile: but it did not help.
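
For reference, one form of the dbutils.fs.cp() approach that might be worth checking is copying the file from DBFS to the driver's local disk with explicit dbfs: and file: schemes, then opening the local copy. This is only a sketch: the function name load_json_via_local_copy and the /tmp target directory are assumptions, and dbutils is passed in as a parameter so the function does not depend on notebook globals.

```python
import json
import os


def load_json_via_local_copy(dbutils, dbfs_path, local_dir="/tmp"):
    """Copy one file from DBFS to the driver's local disk, then parse it."""
    local_path = os.path.join(local_dir, os.path.basename(dbfs_path))
    # "dbfs:" addresses DBFS; "file:" addresses the driver's local filesystem.
    dbutils.fs.cp(f"dbfs:{dbfs_path}", f"file:{local_path}")
    # The local copy is an ordinary file, so the built-in open() can read it.
    with open(local_path, "r") as fh:
        return json.load(fh)
```

In a notebook this would be called as load_json_via_local_copy(dbutils, '/FileStore/DS3002_Final/company.json'), and the returned object could be passed to insert_many in place of the json.load result.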

I also tried reading the file with spark.read.json, but that did not work either.
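
For reference, a sketch of how the Spark route could look. The exact failure is not shown here, so this is a guess at one common cause: if the file holds a single JSON array or a pretty-printed object rather than one object per line, the reader needs the multiLine option. The function name load_json_via_spark is hypothetical, and spark is passed in as a parameter.

```python
def load_json_via_spark(spark, dbfs_uri):
    """Read a JSON file with Spark and return plain dicts for insert_many."""
    # multiLine is needed when the file is one JSON array or a pretty-printed
    # object rather than one JSON object per line.
    df = spark.read.option("multiLine", True).json(dbfs_uri)
    # collect() pulls every row to the driver, so this suits small files only.
    return [row.asDict(recursive=True) for row in df.collect()]
```

A call such as load_json_via_spark(spark, 'dbfs:/FileStore/DS3002_Final/company.json') would return a list of dictionaries without going through open() at all.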

Can anyone help me fix the "No such file or directory" error, or suggest an alternative way to write the function?

Thank you.
