I'm pretty new to Python and I'm trying to build a Scrapy crawler to extract distributor data from a site, but I'm not getting the results I expected. This is my code:

import scrapy


class QuotesSpider(scrapy.Spider):
    name = "final_url"

    def start_requests(self):
        urls = [
            "https://www.bosch-professional.com/ar/es/dl/localizador-de-distribuidores/dealerslist/almagro/2675585174/?countrySelectorCode=AR"
        ]

        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        urls_ = []
        # One row per distributor in the dealer list
        for item in response.css('div.row.m-dealer_list__row'):
            half_urls_ = item.css('div.m-dealer_list__addr a.link.trackingElement::attr(href)').getall()

            for half in half_urls_:
                urls_.append('https://www.bosch-professional.com/ar/es/dl/localizador-de-distribuidores/' + half)

                # Appends the whole list collected so far to the file
                with open('sub_urls.txt', 'a') as doc:
                    doc.write(str(urls_))

I expected one link (href) per distributor (5 in this case), from which I could then extract the name, address, email, phone and website. Instead I get this confusing result:

['https://www.bosch-professional.com/ar/es/dl/localizador-de-distribuidores/localizador-de-distribuidores/distribuidor/boschla00077/almagro/colombo-fernando-javier/?countrySelectorCode=AR']
['https://www.bosch-professional.com/ar/es/dl/localizador-de-distribuidores/localizador-de-distribuidores/distribuidor/boschla00077/almagro/colombo-fernando-javier/?countrySelectorCode=AR', 
'https://www.bosch-professional.com/ar/es/dl/localizador-de-distribuidores/localizador-de-distribuidores/distribuidor/boschla00417/almagro/easy-rivadavia-%28e164%29-cencosud/?countrySelectorCode=AR']
['https://www.bosch-professional.com/ar/es/dl/localizador-de-distribuidores/localizador-de-distribuidores/distribuidor/boschla00077/almagro/colombo-fernando-javier/?countrySelectorCode=AR', 
'https://www.bosch-professional.com/ar/es/dl/localizador-de-distribuidores/localizador-de-distribuidores/distribuidor/boschla00417/almagro/easy-rivadavia-%28e164%29-cencosud/?countrySelectorCode=AR', 
'https://www.bosch-professional.com/ar/es/dl/localizador-de-distribuidores/localizador-de-distribuidores/distribuidor/boschla00506/almagro/g-y-p-new-tree-s.a/?countrySelectorCode=AR']
['https://www.bosch-professional.com/ar/es/dl/localizador-de-distribuidores/localizador-de-distribuidores/distribuidor/boschla00077/almagro/colombo-fernando-javier/?countrySelectorCode=AR', 
'https://www.bosch-professional.com/ar/es/dl/localizador-de-distribuidores/localizador-de-distribuidores/distribuidor/boschla00417/almagro/easy-rivadavia-%28e164%29-cencosud/?countrySelectorCode=AR', 
'https://www.bosch-professional.com/ar/es/dl/localizador-de-distribuidores/localizador-de-distribuidores/distribuidor/boschla00506/almagro/g-y-p-new-tree-s.a/?countrySelectorCode=AR', 
'https://www.bosch-professional.com/ar/es/dl/localizador-de-distribuidores/localizador-de-distribuidores/distribuidor/boschla00303/almagro/medrano-construcciones-s./?countrySelectorCode=AR']
['https://www.bosch-professional.com/ar/es/dl/localizador-de-distribuidores/localizador-de-distribuidores/distribuidor/boschla00077/almagro/colombo-fernando-javier/?countrySelectorCode=AR', 
'https://www.bosch-professional.com/ar/es/dl/localizador-de-distribuidores/localizador-de-distribuidores/distribuidor/boschla00417/almagro/easy-rivadavia-%28e164%29-cencosud/?countrySelectorCode=AR', 
'https://www.bosch-professional.com/ar/es/dl/localizador-de-distribuidores/localizador-de-distribuidores/distribuidor/boschla00506/almagro/g-y-p-new-tree-s.a/?countrySelectorCode=AR', 
'https://www.bosch-professional.com/ar/es/dl/localizador-de-distribuidores/localizador-de-distribuidores/distribuidor/boschla00303/almagro/medrano-construcciones-s./?countrySelectorCode=AR', 
'https://www.bosch-professional.com/ar/es/dl/localizador-de-distribuidores/localizador-de-distribuidores/distribuidor/boschla00304/almagro/medrano-construcciones-s.a./?countrySelectorCode=AR']

I thought this might be due to the 'a' mode passed to open(), but if I use 'w' I just get the last link. The URL I'm requesting here is just one of over 700, so the .txt file I initially created was quite large and useless.

Thanks in advance for any help you can provide. I feel this is some really dumb problem I'm just not seeing.
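
For reference, here is a minimal sketch, independent of Scrapy, that seems to reproduce the growing output described above and shows one possible alternative. The `halves` list, the `base` URL and the helper names are hypothetical placeholders, not values taken from the real site. The idea it illustrates: writing `str(urls_)` inside the loop dumps the entire list collected so far on every iteration, whereas writing only the newest URL (plus a newline) gives one link per line.

```python
import os
import tempfile

# Hypothetical stand-ins for the hrefs the CSS selector would return.
halves = ["dealer-a/", "dealer-b/", "dealer-c/"]
base = "https://example.com/dealers/"  # placeholder base URL, not the real site


def write_cumulative(path):
    """Mimics the question: append the whole growing list on every iteration."""
    urls = []
    for half in halves:
        urls.append(base + half)
        with open(path, "a") as doc:
            # A newline is added here only so each write is easy to tell apart.
            doc.write(str(urls) + "\n")


def write_once_per_url(path):
    """Alternative: open the file once and write each URL on its own line."""
    with open(path, "a") as doc:
        for half in halves:
            doc.write(base + half + "\n")


def read_lines(path):
    with open(path) as doc:
        return doc.read().splitlines()


with tempfile.TemporaryDirectory() as tmp:
    cumulative_path = os.path.join(tmp, "cumulative.txt")
    flat_path = os.path.join(tmp, "flat.txt")
    write_cumulative(cumulative_path)
    write_once_per_url(flat_path)
    cumulative_lines = read_lines(cumulative_path)
    flat_lines = read_lines(flat_path)

for line in cumulative_lines:
    print(line)  # each line repeats every earlier URL plus one new one
for line in flat_lines:
    print(line)  # exactly one URL per line
```

Note that reopening the file with 'w' inside the loop truncates it on every iteration, so only the final write survives; opening the file once, outside the loop, avoids both that and the repeated dumps.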
