Can't get progress bar to work in Python Rich


I'm trying to add a progress bar to my code with Rich. However, while the code is running the bar does not move; it only jumps to 100% after the code has finished. Can anyone help? My code:

theme = Theme({'success': 'bold green',
               'error': 'bold red', 'enter': 'bold blue'})
console = Console(theme=theme)
for i in track(range(1), description='Scraping'):
    global pfp
    global target_id
    chrome_options = Options()
    chrome_options.add_argument("--headless")
    driver = webdriver.Chrome(options=chrome_options)
    begining_of_url = "https://lookup.guru/"
    whole_url = begining_of_url + str(target_id)
    driver.get(whole_url)
    wait = WebDriverWait(driver, 10)
    wait.until(EC.visibility_of_element_located((By.XPATH, "//img")))
    images = driver.find_elements_by_tag_name('img')
    for image in images:
        global pfp
        pfp = (image.get_attribute('src'))
        break
    if pfp == "a":
        console.print("User not found \n", style='error')
        userInput()
    img_data = requests.get(pfp).content
    with open('pfpimage.png', 'wb') as handler:
        handler.write(img_data)
    filePath = "pfpimage.png"
    searchUrl = 'https://yandex.com/images/search'
    files = {'upfile': ('blob', open(filePath, 'rb'), 'image/jpeg')}
    params = {'rpt': 'imageview', 'format': 'json',
              'request': '{"blocks":[{"block":"b-page_type_search-by-image__link"}]}'}
    response = requests.post(searchUrl, params=params, files=files)
    query_string = json.loads(response.content)[
                              'blocks'][0]['params']['url']
    img_search_url = searchUrl + '?' + query_string
    webbrowser.open(whole_url)
    webbrowser.open(img_search_url)
    console.print("Done!", style='success')

Edit: For more clarity, I want the progress bar to update as it goes through each part of my code. There is only one URL to scrape. For example, it would start at 0%, and after global pfp the bar would change to x%.

Thanks for any help :)


There are 2 answers

Thomas On BEST ANSWER

The problem was that with for i in track(range(1), description='Scraping'): the loop has only one iteration, so the bar goes to 100% only when that iteration has finished. Increasing the range() value would update the bar more often, but it would also make the whole body run more than once. To fix this I used the Progress class from rich.progress.

After importing Progress and adapting the example from the Rich documentation, I got:

from rich.progress import Progress
import time

with Progress() as progress:

    task1 = progress.add_task("[red]Scraping", total=100)

    while not progress.finished:
        progress.update(task1, advance=0.5)
        time.sleep(0.5)

Essentially:

  • At task1 = progress.add_task("[red]Scraping", total=100) a bar is created with a maximum value (total) of 100.
  • The code indented under while not progress.finished: loops until the bar is at 100%.
  • At progress.update(task1, advance=0.5) the bar's completed amount is increased by 0.5; the total stays at 100.
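To make the difference between the total and the completed amount concrete, here is a small sketch that inspects the task after each update. It uses the same Progress calls as above; reading the task through progress.tasks and its completed, total and finished fields is only done here for illustration.

```python
from rich.progress import Progress

with Progress() as progress:
    task1 = progress.add_task("[red]Scraping", total=100)
    task = progress.tasks[0]  # the Task object that task1 refers to

    # advance adds to the completed amount; the total is left alone
    progress.update(task1, advance=0.5)
    after_one = (task.completed, task.total)

    # once completed reaches the total, the task and the bar are finished
    progress.update(task1, advance=99.5)
    after_all = (task.completed, task.finished, progress.finished)

print(after_one)
print(after_all)
```

So advance moves the bar toward the total, and progress.finished only becomes true when every task has reached its total.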

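Therefore, for my specific example, my final code was as follows. The 25 calls to progress.update each advance the bar by 4, which adds up to the total of 100, so the body of the while loop runs exactly once: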

theme = Theme({'success': 'bold green',
               'error': 'bold red', 'enter': 'bold blue'})
console = Console(theme=theme)
bartotal = 100

with Progress() as progress:
    task1 = progress.add_task("[magenta bold]Scraping...", total=bartotal)
    while not progress.finished:
        console.print("\nDeclaring global variables", style='success')
        global pfp
        progress.update(task1, advance=4)
        global target_id
        progress.update(task1, advance=4)
        console.print("\nSetting up Chrome driver", style='success')
        chrome_options = Options()
        progress.update(task1, advance=4)
        chrome_options.add_argument("--headless")
        progress.update(task1, advance=4)
        driver = webdriver.Chrome(options=chrome_options)
        progress.update(task1, advance=4)
        console.print("\nCreating url for lookup.guru",
                      style='success')
        begining_of_url = "https://lookup.guru/"
        progress.update(task1, advance=4)
        whole_url = begining_of_url + str(target_id)
        progress.update(task1, advance=4)
        driver.get(whole_url)
        progress.update(task1, advance=4)
        console.print(
            "\nWaiting up to 10 seconds for lookup.guru to load", style='success')
        wait = WebDriverWait(driver, 10)
        progress.update(task1, advance=4)
        wait.until(EC.visibility_of_element_located(
            (By.XPATH, "//img")))
        progress.update(task1, advance=4)
        console.print("\nScraping images", style='success')
        images = driver.find_elements_by_tag_name('img')
        progress.update(task1, advance=4)
        for image in images:
            global pfp
            pfp = (image.get_attribute('src'))
            break
        progress.update(task1, advance=4)
        if pfp == "a":
            console.print("User not found \n", style='error')
            userInput()
        progress.update(task1, advance=4)
        console.print(
            "\nDownloading image to current directory", style='success')
        img_data = requests.get(pfp).content
        progress.update(task1, advance=4)
        with open('pfpimage.png', 'wb') as handler:
            handler.write(img_data)
        progress.update(task1, advance=4)
        filePath = "pfpimage.png"
        progress.update(task1, advance=4)
        console.print("\nUploading to yandex.com", style='success')
        searchUrl = 'https://yandex.com/images/search'
        progress.update(task1, advance=4)
        files = {'upfile': ('blob', open(
            filePath, 'rb'), 'image/jpeg')}
        progress.update(task1, advance=4)
        params = {'rpt': 'imageview', 'format': 'json',
                  'request': '{"blocks":[{"block":"b-page_type_search-by-image__link"}]}'}
        progress.update(task1, advance=4)
        response = requests.post(searchUrl, params=params, files=files)
        progress.update(task1, advance=4)
        query_string = json.loads(response.content)[
                                  'blocks'][0]['params']['url']
        progress.update(task1, advance=4)
        img_search_url = searchUrl + '?' + query_string
        progress.update(task1, advance=4)
        console.print("\nOpening lookup.guru", style='success')
        webbrowser.open(whole_url)
        progress.update(task1, advance=4)
        console.print("\nOpening yandex images", style='success')
        webbrowser.open(img_search_url)
        progress.update(task1, advance=4)
        console.print("\nDone!", style='success')
        progress.update(task1, advance=4)
Will McGugan On

In order to show a progress bar, Rich needs to know how many steps are involved and when you finish a step. The track function can get this information automatically from a sequence. You're using this in your example, but your sequence only has a single element, so you go from 0 to 100% in a single step.

If you want to track the progress of something, you need a sequence that defines the work to be done. For instance, if you had a list of URLs to scrape, you might do something like this:

from rich.progress import track
SCRAPE_URLS = ["https://example.org", "https://google.org", ...]
for url in track(SCRAPE_URLS):
    scrape(url)

The progress bar will advance for every URL.
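When there is only one URL, as in the question, the stages of the job can serve as the sequence instead. The following is a minimal sketch of that idea, not the asker's actual code: load_page, download_image and reverse_search are hypothetical placeholder functions that just sleep, standing in for the real Selenium and requests calls.

```python
import time

from rich.progress import track

results = []  # records which stages ran, in order


def load_page():
    time.sleep(0.1)  # placeholder for loading the page and waiting for images
    results.append("load_page")


def download_image():
    time.sleep(0.1)  # placeholder for downloading the profile picture
    results.append("download_image")


def reverse_search():
    time.sleep(0.1)  # placeholder for uploading to the image search
    results.append("reverse_search")


# The list of stages is the sequence that track() measures progress against.
STAGES = [load_page, download_image, reverse_search]

for stage in track(STAGES, description="Scraping"):
    stage()
```

With three stages the bar moves in thirds; splitting the work into more functions gives finer steps without having to keep a hand-counted total in sync.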