Efficient Large File Upload (500GB) to Azure Blob Storage from ReactJS UI with FastAPI/Flask Backend


I'm working on a project where I need to upload very large files (approximately 500GB) from a ReactJS front-end to Azure Blob Storage, using a Python-based FastAPI or Flask backend. My goal is to optimize the upload process to make it as efficient as possible.

I'm aware that chunking the file and making multiple calls to the backend is a common approach for large file uploads, but it can be time-consuming, especially given the large file size.

I'd like to achieve the following:

  1. Upload the 500GB file in less than 20 minutes.
  2. Take into account a stable internet connection with a speed of around 100Mbps.
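As a sanity check on these two targets, here is a rough calculation relating file size, link speed, and transfer time. It is only a sketch: it assumes decimal units (1 GB = 10^9 bytes, 1 Mbps = 10^6 bits per second) and ignores protocol overhead, and the helper names are made up for illustration.

```python
def transfer_seconds(size_bytes: int, link_mbps: float) -> float:
    """Ideal transfer time in seconds for a given link speed (no overhead)."""
    return size_bytes * 8 / (link_mbps * 1_000_000)


def required_mbps(size_bytes: int, seconds: float) -> float:
    """Sustained throughput in Mbps needed to finish within `seconds`."""
    return size_bytes * 8 / seconds / 1_000_000


FILE_SIZE = 500 * 10**9  # 500 GB, decimal units (assumption)

hours_at_100 = transfer_seconds(FILE_SIZE, 100) / 3600
needed = required_mbps(FILE_SIZE, 20 * 60)

print(f"{hours_at_100:.1f} h at 100 Mbps")          # 11.1 h at 100 Mbps
print(f"{needed:.0f} Mbps needed for 20 minutes")   # 3333 Mbps needed for 20 minutes
```

Under those assumptions, 500 GB over a 100 Mbps link takes roughly 11 hours, and finishing in 20 minutes would need about 3.3 Gbps of sustained throughput, so the two targets cannot both hold on the same link.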

What are the best practices and tools I should consider to achieve this goal efficiently? Are there any specific libraries or methods that can help with optimizing the upload process? Additionally, are there any configuration or settings I should be aware of in Azure Blob Storage for such large uploads?

I'm open to any suggestions or recommendations to streamline this large file upload process effectively. Thank you for your insights!


There are 2 answers

1
Sampath On

You can try the approach below to improve your blob upload performance.

The following is sample code combining a FastAPI backend and a React frontend for uploading to Azure Blob Storage.

React frontend:

File upload form with an input field for selecting a file and a button to trigger the upload.

import React, { useState } from 'react';
import axios from 'axios';

function App() {
  // The file currently chosen in the <input type="file"> element
  const [selectedFile, setSelectedFile] = useState(null);

  const handleFileChange = (event) => {
    setSelectedFile(event.target.files[0]);
  };

  const handleFileUpload = async () => {
    // Nothing to send until a file has been chosen
    if (!selectedFile) {
      return;
    }

    const formData = new FormData();
    formData.append('file', selectedFile);

    try {
      const response = await axios.post('http://localhost:8000/upload/', formData, {
        headers: {
          'Content-Type': 'multipart/form-data',
        },
      });
      console.log(response.data);
    } catch (error) {
      console.error('File upload failed:', error);
    }
  };

  return (
    <div>
      <h1>File Upload</h1>
      <input type="file" onChange={handleFileChange} />
      <button onClick={handleFileUpload}>Upload File</button>
    </div>
  );
}

export default App;


FastAPI Backend:

The /upload/ endpoint allows the client to upload a file to Azure Blob Storage. It accepts a file upload using the UploadFile type.

from azure.storage.blob import BlobBlock, BlobServiceClient
from fastapi import FastAPI, UploadFile
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import JSONResponse

app = FastAPI()

# Allow CORS for your front-end application
app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:3000"],  # Update with your front-end URL
    allow_methods=["*"],
    allow_headers=["*"],
)

# Initialize the Azure Blob Storage client (replace both placeholders)
azure_blob_service = BlobServiceClient.from_connection_string("BlobConnectionString")
CONTAINER_NAME = "ContainerName"

# Function to check the connection to Azure Blob Storage
def is_azure_connected():
    try:
        azure_blob_service.get_service_properties()
        return True
    except Exception:
        return False

# Declared with plain "def" so FastAPI runs the blocking SDK calls in a worker thread
@app.post("/upload/")
def upload_file(file: UploadFile):
    container_client = azure_blob_service.get_container_client(CONTAINER_NAME)

    try:
        blob_client = container_client.get_blob_client(file.filename)

        # Upload the file in chunks: stage each chunk as a block, then commit the list.
        # Only one chunk is held in memory at a time.
        chunk_size = 500 * 1024 * 1024
        block_list = []
        with file.file as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    break
                # Block IDs must all have the same length within one blob
                block_id = f"{len(block_list):08d}"
                blob_client.stage_block(block_id=block_id, data=chunk)
                block_list.append(BlobBlock(block_id=block_id))

        # The blob only becomes visible once the staged blocks are committed
        blob_client.commit_block_list(block_list)

        # Fetch the blob properties to get the file size
        blob_properties = blob_client.get_blob_properties()
        file_size = blob_properties.size

        message = f"File uploaded successfully. File size: {file_size} bytes"
        print("File uploaded")
    except Exception as e:
        message = "File upload failed"
        print(f"File upload failed: {str(e)}")

    response_data = {"message": message}

    return JSONResponse(content=response_data)

@app.get("/list_files/")
def list_files():
    container_client = azure_blob_service.get_container_client(CONTAINER_NAME)
    blob_list = container_client.list_blobs()

    file_list = [{"file_name": blob.name, "file_size": blob.size} for blob in blob_list]

    return JSONResponse(content=file_list)

if __name__ == "__main__":
    if is_azure_connected():
        print("Connected to Azure Blob Storage")
    else:
        print("Connection to Azure Blob Storage failed")
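
When a file is uploaded in chunks as a block blob, the chunk size has to be chosen so that the whole file fits within the service's per-blob limits. Below is a minimal sketch of that sizing step. It assumes limits of 50,000 blocks per blob and 4000 MiB per block (please check the current Azure documentation before relying on them), and plan_blocks and block_ids are hypothetical helper names, not part of any SDK.

```python
import math

MAX_BLOCKS = 50_000                 # assumed limit: blocks per block blob
MAX_BLOCK_BYTES = 4000 * 1024**2    # assumed limit: bytes per block (4000 MiB)


def plan_blocks(file_size: int, preferred_block: int = 100 * 1024**2) -> tuple[int, int]:
    """Return (block_size, block_count) that keeps the file within the limits."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    # Smallest block size that still fits the file into MAX_BLOCKS blocks
    min_block = math.ceil(file_size / MAX_BLOCKS)
    block_size = max(preferred_block, min_block)
    if block_size > MAX_BLOCK_BYTES:
        raise ValueError("file is too large for a single block blob")
    return block_size, math.ceil(file_size / block_size)


def block_ids(count: int) -> list[str]:
    """Fixed-width IDs, since all block IDs in a blob must share one length."""
    return [f"{i:08d}" for i in range(count)]


size, count = plan_blocks(500 * 10**9)  # 500 GB, decimal units
print(size, count)
```

With the default 100 MiB preference, a 500 GB file comes out at a few thousand blocks, well under the assumed block-count limit. A very small preferred size is raised automatically so that the count never exceeds it.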


0
Bhavesh Kumar Sharma On

I was trying to solve a similar problem. The issue is that Azure does not accept Content-Type: multipart/form-data for file uploads. Try these Stack Overflow answers:

  1. Answer 1

  2. Answer 2
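
One way to avoid multipart/form-data, as far as I understand the Blob Storage REST API, is to send each chunk as a raw request body with a Put Block call and then commit the blocks with Put Block List. The sketch below only builds the request URLs and the commit body. It assumes the backend has issued a SAS URL for the target blob; the example URL and the helper names are hypothetical.

```python
import base64
from urllib.parse import quote, urlsplit, urlunsplit


def encode_block_id(index: int) -> str:
    """Base64-encode a fixed-width block ID, as the REST API expects."""
    return base64.b64encode(f"{index:08d}".encode()).decode()


def _with_query(sas_url: str, extra: str) -> str:
    """Prepend extra query parameters while keeping the SAS token intact."""
    parts = urlsplit(sas_url)
    query = f"{extra}&{parts.query}" if parts.query else extra
    return urlunsplit(parts._replace(query=query))


def put_block_url(sas_url: str, block_id: str) -> str:
    """URL for a PUT request whose body is the raw bytes of one chunk."""
    return _with_query(sas_url, f"comp=block&blockid={quote(block_id, safe='')}")


def put_block_list_url(sas_url: str) -> str:
    """URL for the final PUT request that commits the staged blocks."""
    return _with_query(sas_url, "comp=blocklist")


def block_list_xml(block_ids: list[str]) -> str:
    """XML body for Put Block List, listing the blocks in upload order."""
    items = "".join(f"<Latest>{bid}</Latest>" for bid in block_ids)
    return f'<?xml version="1.0" encoding="utf-8"?><BlockList>{items}</BlockList>'


# Hypothetical SAS URL; a real one would come from the backend
sas = "https://account.blob.core.windows.net/container/big.bin?sv=x&sig=y"
ids = [encode_block_id(i) for i in range(2)]
print(put_block_url(sas, ids[0]))
print(put_block_list_url(sas))
print(block_list_xml(ids))
```

Each chunk then goes out as a plain PUT body rather than a form field, and the client can send several chunks in parallel before issuing the final commit request.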