Bash + Pup printing only attribute


I'm downloading a web page's source with wget, then using pup to grab the <meta> tag that I need. Now I want to print only the value of that tag's content attribute.

In this case, the output I want is: https://example.com/my/folder/first.jpg?foo=bar

# wget page to /tmp/output.html
IMAGE_URL=$(pup 'meta[property*="og:image"]' < /tmp/output.html)
echo "$IMAGE_URL"
# currently prints the whole tag:
# <meta property="og:image" content="https://example.com/my/folder/first.jpg?foo=bar">

There are 2 answers

John Smith (BEST ANSWER)
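# wget adds the "User-Agent:" header name itself, so pass only the value
wget -O /tmp/output.html --user-agent="Whatever..." https://example.com/somewhere
# --plain stops pup from escaping HTML entities; sed keeps only the text inside content="..."
IMAGE_URL=$(pup --plain 'meta[property*="og:image"]' < /tmp/output.html | sed -n 's/.*content="\([^"]*\)".*/\1/p')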
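
To see what the sed expression does on its own, here is a minimal sketch that skips wget and pup and runs it against a hard-coded sample line. The META variable is only an illustration; it holds the tag from the question and stands in for pup's output, assuming pup prints the tag on a single line.

```shell
# Sample line standing in for pup's output (copied from the question).
META='<meta property="og:image" content="https://example.com/my/folder/first.jpg?foo=bar">'

# -n suppresses sed's default printing; the trailing p prints only lines
# where the substitution matched. \([^"]*\) captures everything between
# content=" and the next double quote, and \1 replaces the line with it.
IMAGE_URL=$(printf '%s\n' "$META" | sed -n 's/.*content="\([^"]*\)".*/\1/p')

echo "$IMAGE_URL"
```

Lines without a content="..." attribute produce no output at all, so IMAGE_URL ends up empty when the tag is missing.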
Antoine

You can append pup's attr{content} display function to the selector to print only the value of the content attribute.

wget -O /tmp/output.html --user-agent="Whatever..." https://example.com/somewhere
IMAGE_URL=$(pup 'meta[property*="og:image"] attr{content}' < /tmp/output.html)