Using Playwright for Python, how do I select (or find) an element?


I'm trying to learn the Python version of Playwright. See here

I would like to learn how to locate an element so that I can do things with it, such as printing its inner HTML or clicking on it.

The example below loads a page and prints the HTML

from playwright import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    page = browser.newPage()
    page.goto('http://whatsmyuseragent.org/')
    print(page.innerHTML("*"))
    browser.close()

This page contains an element

<div class="user-agent">
    <p class="intro-text">Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4238.0 Safari/537.36</p>
</div>

Using Selenium, I could locate the element and print its content like this

elem = driver.find_element_by_class_name("user-agent")
print(elem)
print(elem.get_attribute("innerHTML"))

How can I do the same in Playwright?

UPDATE: If you want to run this in 2021 or later, note that current versions of Playwright have changed the method names from camelCase to snake_case (for example, newPage is now new_page).


There are 5 answers

3
hardkoded On BEST ANSWER

You can use the querySelector function, and then call the innerHTML function:

handle = page.querySelector(".user-agent")
print(handle.innerHTML())
0
freeboy2099 On

I think you can find the solutions in the following article. Playwright >> Find, Locate, Select Elements/Tags using Playwright

  • Playwright find all elements/tags containing specified text
  • Playwright find elements/tags containing specified child element/tag
  • Playwright loop through all elements/tags in locator() result
  • Playwright find/get first element
  • Playwright find/get last element
  • Playwright get the parent element
  • Playwright get the child element
  • Playwright get nth child element
  • Playwright find elements/tags by css class
  • Playwright find elements near the specified text
  • Playwright find elements/tags by attribute
  • Playwright find elements/tags by id
2
Upendra On

The accepted answer does not work with the newer versions of Playwright. (Thanks @576i for pointing this out)

Here is the Python code that works with the newer versions (tested with version 1.5):

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto('http://whatsmyuseragent.org/')
    ua = page.query_selector(".user-agent")
    print(ua.inner_html())
    browser.close()

To get only the text, use the inner_text() function.

print(ua.inner_text())
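
One caveat with this approach: query_selector returns None when nothing matches the selector, so calling inner_html() on the result would raise an AttributeError. A minimal sketch of a guard, assuming `page` is a sync-API Page like the one above; `inner_html_or_none` is a hypothetical helper name, not part of Playwright:

```python
def inner_html_or_none(page, selector):
    """Return the inner HTML of the first match, or None if nothing matches.

    `page` is assumed to be a Playwright sync-API Page.
    """
    handle = page.query_selector(selector)  # None when no element matches
    if handle is None:
        return None
    return handle.inner_html()
```

With the page from the example above, this would be called as print(inner_html_or_none(page, ".user-agent")).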
0
crifan On

According to the latest official Python version of Playwright, you should use:

# userAgentSelector = ".user-agent"
userAgentSelector = "div.user-agent"
elementHandle = page.query_selector(userAgentSelector)
uaHtml = elementHandle.inner_html()
print("uaHtml=%s" % uaHtml)
0
ggorlen On

Existing answers are a bit outdated. Nowadays the locator API is recommended since auto-waiting is the common case:

from playwright.sync_api import sync_playwright  # 1.37.0

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://www.example.com")
    text = page.locator("h1").text_content()
    print(text)
    browser.close()

Use query_selector when you don't want to wait and instead want an immediate None if no element matches the selector.
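
Locators can do an immediate check too, since locator.count() does not auto-wait. A rough sketch, assuming `page` is a sync-API Page; `text_or_none` is a hypothetical helper name used only for illustration:

```python
def text_or_none(page, selector):
    """Return the text of the first match without waiting, or None if absent.

    `page` is assumed to be a Playwright sync-API Page.
    """
    locator = page.locator(selector)
    if locator.count() == 0:  # count() reports current matches, no auto-wait
        return None
    return locator.first.text_content()
```

With the page from the example above, text_or_none(page, "h1") would give the heading text, while a selector with no matches gives None right away instead of timing out.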

Note that http://whatsmyuseragent.org is down, so I used a different site, but it's basically the same.
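
To stay closer to the original question, here is a sketch of the same locator approach applied to the markup quoted in the question. It loads that fragment with page.set_content, so no network access is needed. The names `HTML`, `describe_user_agent` and `main` are chosen here for illustration, and the Playwright import sits inside main() only so the helper can be reused without launching a browser:

```python
# HTML fragment copied from the question.
HTML = """
<div class="user-agent">
    <p class="intro-text">Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4238.0 Safari/537.36</p>
</div>
"""


def describe_user_agent(page):
    """Return (inner HTML of the div, text of its paragraph) using locators."""
    div = page.locator(".user-agent")
    html = div.inner_html()
    # Locators can be chained to search inside a parent element.
    text = div.locator("p.intro-text").inner_text()
    return html, text


def main():
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.set_content(HTML)  # stands in for page.goto(...) on the dead site
        html, text = describe_user_agent(page)
        print(html)
        print(text)
        page.locator("p.intro-text").click()  # clicking works the same way
        browser.close()
```

Calling main() launches Chromium, prints the inner HTML and the text, and clicks the paragraph.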