
Getting elements that have specific attributes with pyquery


I have something like this in an HTML page:

<ul>
    <li>
        <span data-name-en="data1">Value1</span>
        <span data-view-en="test1"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data2">Value2</span>
        <span data-view-en="test2"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data3">Value3</span>
        <span data-view-en="test3"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data4">Value4</span>
        <span data-view-en="test4"><span class="fa fa-gear"></span></span>
    </li>
</ul>

How can I get all elements that have a data-name-en attribute?


There are 2 answers

Dmitry Erohin On
from bs4 import BeautifulSoup as bs

s = '''
<ul>
    <li>
        <span data-name-en="data1">Value1</span>
        <span data-view-en="test1"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data2">Value2</span>
        <span data-view-en="test2"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data3">Value3</span>
        <span data-view-en="test3"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data4">Value4</span>
        <span data-view-en="test4"><span class="fa fa-gear"></span></span>
    </li>
</ul>
'''

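# The input is HTML, so use the built-in HTML parser
# (the 'xml' parser also requires lxml to be installed)
soup = bs(s, 'html.parser')

# soup('span') is shorthand for soup.find_all('span');
# keep only the spans that carry the data-name-en attribute
result = [x['data-name-en'] for x in soup('span') if x.has_attr('data-name-en')]

print(result)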
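
As a possible alternative to filtering with has_attr(), BeautifulSoup's select() accepts a CSS attribute-presence selector, so the match can be written in one step. This is only a sketch on a shortened copy of the markup from the question; the names markup, spans and names are made up for illustration.

```python
from bs4 import BeautifulSoup

# Shortened copy of the markup from the question
markup = '''
<ul>
    <li>
        <span data-name-en="data1">Value1</span>
        <span data-view-en="test1"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data2">Value2</span>
        <span data-view-en="test2"><span class="fa fa-gear"></span></span>
    </li>
</ul>
'''

soup = BeautifulSoup(markup, 'html.parser')

# [data-name-en] matches any element that has the attribute, whatever its value
spans = soup.select('span[data-name-en]')
names = [span['data-name-en'] for span in spans]

print(names)
```

The spans with only data-view-en are skipped because the selector tests for the presence of data-name-en.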
Chalist On

I found the correct answer:

s = '''
<ul>
    <li>
        <span data-name-en="data1">Value1</span>
        <span data-view-en="test1"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data2">Value2</span>
        <span data-view-en="test2"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data3">Value3</span>
        <span data-view-en="test3"><span class="fa fa-gear"></span></span>
    </li>
    <li>
        <span data-name-en="data4">Value4</span>
        <span data-view-en="test4"><span class="fa fa-gear"></span></span>
    </li>
</ul>
'''

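from pyquery import PyQuery

html = PyQuery(s)
# Attribute-presence selector: every span inside an li that has data-name-en
items = html.find('li span[data-name-en]')

To get the attribute value of each match, wrap the element in PyQuery and call attr():

for item in items:
    print(PyQuery(item).attr("data-name-en"))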