I have a small service that parses data from a particular site. It was written with Python 3.10 and Selenium 4.10.0. It works, but after some time (5-15 minutes) it crashes.
The exception looks like this:
Traceback (most recent call last):
app | File "//app.py", line 199, in <module>
app | driver.get(link)
app | File "/usr/local/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 355, in get
app | self.execute(Command.GET, {"url": url})
app | File "/usr/local/lib/python3.10/site-packages/selenium/webdriver/remote/webdriver.py", line 346, in execute
app | self.error_handler.check_response(response)
app | File "/usr/local/lib/python3.10/site-packages/selenium/webdriver/remote/errorhandler.py", line 245, in check_response
app | raise exception_class(message, screen, stacktrace)
app | selenium.common.exceptions.InvalidSessionIdException: Message: invalid session id
app | Stacktrace:
app | #0 0x55ac01c904e3 <unknown>
app | #1 0x55ac019bfb00 <unknown>
app | #2 0x55ac019ef919 <unknown>
app | #3 0x55ac01a1af16 <unknown>
app | #4 0x55ac01a1717a <unknown>
app | #5 0x55ac01a168a6 <unknown>
app | #6 0x55ac0198f263 <unknown>
app | #7 0x55ac01c503e4 <unknown>
app | #8 0x55ac01c543d7 <unknown>
app | #9 0x55ac01c5eb20 <unknown>
app | #10 0x55ac01c55023 <unknown>
app | #11 0x55ac01c231aa <unknown>
app | #12 0x55ac0198da43 <unknown>
app | #13 0x7f341189d18a <unknown>
**But one day it also contained information about a Chrome page crash**
After some research I found these suggested solutions:
- unknown error: session deleted because of page crash from unknown error: cannot determine loading status from tab crashed with ChromeDriver Selenium
- https://github.com/elgalu/docker-selenium/issues/20
- https://svdoscience.com/2021-03-17/fix-session-deleted-page-crash-selenium-grid-chrome-docker
I tried everything, but it still crashes.
Here is the part of the docker-compose file related to the service with Selenium:
app:
  # restart: always
  build: ./app
  links:
    - postgres:postgres
  secrets:
    - postgres_user
    - postgres_password
  volumes:
    - /dev/shm:/dev/shm
  mem_limit: "2g"
  mem_reservation: "512m"
  shm_size: '2gb'
  privileged: true
  environment:
    POSTGRES_DB: db
    DATABASE_PORT: 5432
    POSTGRES_USER_FILE: /run/secrets/user
    POSTGRES_PASSWORD_FILE: /run/secrets/password
  depends_on:
    - postgres
  command: >
    sh -c "python3 app.py"
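Since the compose file both bind-mounts the host's /dev/shm and sets shm_size (together with mem_limit), it may be worth confirming from inside the container which values are actually in effect. Below is a minimal diagnostic sketch using only the standard library. The helper names `shm_usage_mib` and `memory_limit_bytes` are made up for illustration, and the cgroup file locations are the usual ones for cgroup v2 and v1, which is an assumption about the host.

```python
import shutil
from pathlib import Path

# Usual locations of the memory limit file (assumption: standard cgroup mounts).
CGROUP_LIMIT_FILES = (
    Path("/sys/fs/cgroup/memory.max"),                    # cgroup v2
    Path("/sys/fs/cgroup/memory/memory.limit_in_bytes"),  # cgroup v1
)


def shm_usage_mib(path: str = "/dev/shm") -> dict:
    """Return total/used/free space of the given mount in MiB."""
    usage = shutil.disk_usage(path)
    mib = 1024 * 1024
    return {
        "total": usage.total // mib,
        "used": usage.used // mib,
        "free": usage.free // mib,
    }


def memory_limit_bytes(candidates=CGROUP_LIMIT_FILES):
    """Return the memory limit in bytes, or None if it cannot be determined.

    cgroup v2 reports the literal string "max" when no limit is set, which is
    mapped to None here. cgroup v1 reports a very large number in that case.
    """
    for limit_file in candidates:
        try:
            raw = limit_file.read_text().strip()
        except OSError:
            continue  # file missing or unreadable: try the next candidate
        return int(raw) if raw.isdigit() else None
    return None
```

Calling `shm_usage_mib()` and `memory_limit_bytes()` periodically from the scraping loop and logging the results could show whether /dev/shm fills up or the 2g limit is being approached shortly before the crash.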
Here is the Dockerfile used to build the app (based on https://nander.cc/using-selenium-within-a-docker-container):
FROM python:3.10
COPY requirements.txt requirements.txt
RUN python -m pip install --upgrade pip && python -m pip install -r requirements.txt
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add -
RUN sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list'
RUN apt-get -y update
RUN apt-get install -y google-chrome-stable
RUN apt-get install -yqq unzip
RUN wget -O /tmp/chromedriver.zip http://chromedriver.storage.googleapis.com/`\
curl -sS chromedriver.storage.googleapis.com/LATEST_RELEASE\
`/chromedriver_linux64.zip
RUN unzip /tmp/chromedriver.zip chromedriver -d /usr/local/bin/
ENV DISPLAY=:99
COPY . .
I tried fixing /dev/shm, changing shm-size, adding options to the Chrome instance, and almost every combination of these. I haven't verified it thoroughly, but it seems that without Docker it works without any problem. Here are the Chrome options:
from selenium import webdriver


def set_chrome_options() -> webdriver.ChromeOptions:
    """Sets chrome options for Selenium.
    Chrome options for headless browser is enabled.
    """
    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument("--headless")
    chrome_options.add_argument("--no-sandbox")
    chrome_options.add_argument("--disable-dev-shm-usage")
    chrome_prefs = {}
    chrome_options.experimental_options["prefs"] = chrome_prefs
    chrome_prefs["profile.default_content_settings"] = {"images": 2}
    return chrome_options
And, of course, I don't close the driver before scraping. Everything was checked on macOS and also on Linux Mint 21. Docker version 23.0.2.
[UPDATE] I have now also tried other possible solutions, such as switching from Chrome to Chromium and adding other options to Chrome (headless, --remote-debugging-port=9222), but that doesn't work either.
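While the root cause is unknown, one possible stopgap is to recreate the browser whenever the session is lost instead of letting the whole service crash. The following is only a sketch under stated assumptions: `get_with_restart` and its parameters are hypothetical names, not part of Selenium, and the exception types to catch are passed in by the caller. It hides the crash rather than fixing it.

```python
import logging
import time
from typing import Callable, Tuple, Type

log = logging.getLogger(__name__)


def get_with_restart(
    make_driver: Callable[[], object],
    driver,
    url: str,
    session_errors: Tuple[Type[BaseException], ...],
    max_restarts: int = 2,
    pause_seconds: float = 1.0,
):
    """Load url; on a session error, replace the driver and try again.

    Returns the driver that finally loaded the page, which may be a new one.
    Re-raises the last error once max_restarts restarts have been used up.
    """
    for attempt in range(max_restarts + 1):
        try:
            driver.get(url)
            return driver
        except session_errors as exc:
            if attempt == max_restarts:
                raise
            log.warning("session lost on %s (%s); restarting browser", url, exc)
            try:
                driver.quit()  # best effort: the old session may already be gone
            except Exception:
                pass
            time.sleep(pause_seconds)
            driver = make_driver()
```

With Selenium this might be called as `driver = get_with_restart(lambda: webdriver.Chrome(options=set_chrome_options()), driver, link, (InvalidSessionIdException,))`, where `InvalidSessionIdException` is imported from `selenium.common.exceptions`, as in the traceback above.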