Progress bar when copying one large file in Python?

I want to use tqdm to display a progress bar when copying a single large file from one filepath to another.

This is not the same as showing a progress bar when copying multiple small files.

There is 1 answer

David Foster On

Strategy:

  1. Determine the length of the file first so that you can tell tqdm what it is.

  2. Normally you would then copy the file with shutil.copyfile (or shutil.copyfileobj), but neither function provides a way to receive updates during the copy, which we need in order to forward them to tqdm:

    • So I wrote a slightly modified version of copyfileobj that accepts an update function and calls it after each chunk is written.
    • Then I passed tqdm's update method as that function, so the progress bar advances as the copy proceeds.

Code:

import os
from tqdm import tqdm

def copy_with_progress(source_filepath, target_filepath):
    with open(source_filepath, 'rb') as source_file:
        with open(target_filepath, 'wb') as target_file:
            # Seek to the end to measure the file, then rewind before copying.
            source_file.seek(0, os.SEEK_END)
            file_length = source_file.tell()
            source_file.seek(0, os.SEEK_SET)

            print(f'Copying: {source_filepath} ({file_length:n} bytes)')
            with tqdm(total=file_length, unit='B', unit_scale=True) as progress_bar:
                copyfileobj_with_progress(
                    source_file, target_file,
                    progress_func=progress_bar.update)

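# Based on shutil.copyfileobj.
_WINDOWS = os.name == 'nt'
COPY_BUFSIZE = 1024 * 1024 if _WINDOWS else 64 * 1024


def copyfileobj_with_progress(fsrc, fdst, length=0, progress_func=None):
    """Copy fsrc to fdst in chunks, calling progress_func(n) after each n-byte write."""
    if not length:
        length = COPY_BUFSIZE
    if progress_func is None:
        # No-op callback so the loop below needs no None check.
        progress_func = lambda delta: None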
    
    # Localize variable access to minimize overhead.
    fsrc_read = fsrc.read
    fdst_write = fdst.write
    
    while True:
        buf = fsrc_read(length)
        if not buf:
            break
        fdst_write(buf)
        progress_func(len(buf))  # delta offset
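
To check the callback contract without a terminal or tqdm, a minimal sketch could run the same chunked-read loop over in-memory streams and record each value passed to the callback. The names copy_chunks, payload, and deltas are hypothetical and introduced only for this illustration; the loop is a condensed stand-in for copyfileobj_with_progress above, not a replacement for it.

```python
import io


def copy_chunks(fsrc, fdst, length=64 * 1024, progress_func=None):
    """Condensed stand-in for copyfileobj_with_progress (hypothetical helper)."""
    while True:
        buf = fsrc.read(length)
        if not buf:
            break
        fdst.write(buf)
        if progress_func is not None:
            # Report the size of this chunk only, not a running total.
            progress_func(len(buf))


payload = b"x" * 150_000   # arbitrary test data spanning more than two chunks
source = io.BytesIO(payload)
target = io.BytesIO()
deltas = []                # collects every value the callback receives

copy_chunks(source, target, progress_func=deltas.append)
```

Since tqdm's update method takes an increment rather than an absolute position, the bar reaches its total only if the reported deltas add up to the file length, which is the property this sketch makes easy to inspect.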