Extracting links with Scrapy that have a specific CSS class

Conceptually simple question/idea.

Using Scrapy, how do I use a LinkExtractor that extracts and follows only links with a given CSS class?

This seems trivial, like something that should already be built in, but I don't see it. Is it?

It looks like I can use an XPath, but I'd prefer using CSS selectors. It seems like they are not supported?

Do I have to write a custom LinkExtractor to use CSS selectors?

There is 1 answer

Answer by alecxe (best answer)

From what I understand, you want something similar to restrict_xpaths, but provide a CSS selector instead of an XPath expression.

This is actually a built-in feature in Scrapy 1.0 (currently in a release candidate state). The argument is called restrict_css:

restrict_css

a CSS selector (or list of selectors) which defines regions inside the response where links should be extracted from. Has the same behaviour as restrict_xpaths.

The initial feature request: