Is it possible to define a HTML selector that concatenates multiple selectors and separates them by semicolon?

1.3k views · Asked by Daniel on 04 January 2019 at 17:53

I'm trying to parse a simple HTML page with pup, a command-line HTML parser that accepts standard CSS selectors.

I want to select:

'div.aclass text{}' #(would be SampleA)

and I also want to select:

'div.bclass text{}' #(would be SampleB)

and I want to concatenate them and insert some custom text to get:

SampleA;MYEXTRASTRING;SampleB

I want to avoid calling pup more than once as it is slow.

I can select multiple tags:

'div.aclass text{}, div.bclass text{}'

but this results in:

SampleA
SampleB

Is there any better choice than pup for this purpose?

(Note: Python is NOT an option as it's very slow for my needs.)

There is 1 answer

**Kevin Cui** · Answer 1 · 2019-01-04T21:35:53+00:00

Multiple selectors do not seem to work with pup; there is an issue about it here: https://github.com/ericchiang/pup/issues/59

To achieve what you want, I would suggest using the hxselect command, which is part of HTML-XML-utils: https://www.w3.org/Tools/HTML-XML-utils/README

Example:

curl -s http://example.com/ | hxselect -c 'body > div:nth-child(1) > h1:nth-child(1)', 'body > div:nth-child(1) > p:nth-child(3) > a:nth-child(1)' -s ';MYEXTRASTRING;' | sed 's/\(.*\);MYEXTRASTRING;/\1/'

curl part:

curl downloads the HTML content of http://example.com/. The -s flag keeps it silent, so no progress output is printed.

hxselect part:

hxselect supports multiple CSS selectors. Separate them with a comma (,).

-c: print only the content of each match, without the enclosing HTML tags

-s: separator text written after each match. In your case, it's ;MYEXTRASTRING;

sed part:

Because the -s separator text is written after every match, it appears twice here: once between the two results and once after the last one. sed is used to remove that final, trailing separator.
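The sed step can be tried on its own, without curl or hxselect. This is only a minimal sketch: the `raw` string is a hypothetical stand-in for what hxselect would print (using the sample values from the question), and it assumes the whole output sits on a single line.

```shell
#!/bin/sh
# Simulated hxselect -s output: every match, including the last one,
# is followed by the separator (sample values from the question).
raw='SampleA;MYEXTRASTRING;SampleB;MYEXTRASTRING;'

# The greedy \(.*\) consumes as much of the line as it can, so the
# literal ;MYEXTRASTRING; in the pattern can only match the final
# occurrence, which is then dropped from the output.
result=$(printf '%s\n' "$raw" | sed 's/\(.*\);MYEXTRASTRING;/\1/')

printf '%s\n' "$result"
# prints: SampleA;MYEXTRASTRING;SampleB
```

If the custom text contained characters that are special in a sed pattern (such as `.`, `*` or `/`), they would need to be escaped first.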
