I'm trying to iterate through a JSON response from a page using js2xml. My question is: how do I select the 'stores' node and return only that as my result? The JSON looks like this:
<script>
window.appData = {
"ressSize": "large",
"cssPath": "http://css.bbystatic.com/",
"imgPath": "http://images.bbystatic.com/",
"jsPath": "http://js.bbystatic.com/",
"bbyDomain": "http://www.bestbuy.com/",
"bbySslDomain": "https://www-ssl.bestbuy.com/",
"isUserLoggedIn": false,
"zipCode": "46801",
"stores": [{
"id": "2727",
"name": "GLENBROOK SQUARE",
"addr1": "4201 coldwater rd",
"addr2": "spc g10",
"city": "fort wayne",
"state": "IN",
"country": "US",
"zipCode": "46805",
"phone": "260-482-5230"...
</script>
My spider for this is straightforward, but I can't work out how to parse the 9th node, 'stores'. This is what I've got so far:
import js2xml

def parse(self, response):
    js = response.xpath('//script[contains(.,"window.appData")]/text()').extract_first()
    jstree = js2xml.parse(js)
    jstree.xpath('//assign[left//identifier[@name="appData"]]/right/*')
    js2xml.make_dict(jstree.xpath('//assign[left//identifier[@name="appData"]]/right/*')[0])
The response to this gives me:
<program>
  <assign operator="=">
    <left>
      <dotaccessor>
        <object>
          <identifier name="window"/>
        </object>
        <property>
          <identifier name="appData"/>
        </property>
      </dotaccessor>
    </left>
    <right>
      <object>
        <property name="ressSize">
          <string>large</string>
        </property>
        <property name="cssPath">
          <string>http://css.bbystatic.com/</string>
        </property>
        <property name="imgPath">
          <string>http://images.bbystatic.com/</string>
        </property>
        <property name="jsPath">
          <string>http://js.bbystatic.com/</string>
        </property>
        <property name="bbyDomain">
          <string>http://www.bestbuy.com/</string>
        </property>
        <property name="bbySslDomain">
          <string>https://www-ssl.bestbuy.com/</string>
        </property>
        <property name="isUserLoggedIn">
          <boolean>false</boolean>
        </property>
        <property name="zipCode">
          <string></string>
        </property>
        <property name="stores">
          <array/>
        </property>
        <property name="preferredStores">
          <array/>
        </property>
      </object>
    </right>
  </assign>
</program>
{'bbyDomain': 'http://www.bestbuy.com/',
 'bbySslDomain': 'https://www-ssl.bestbuy.com/',
 'cssPath': 'http://css.bbystatic.com/',
 'imgPath': 'http://images.bbystatic.com/',
 'isUserLoggedIn': False,
 'jsPath': 'http://js.bbystatic.com/',
 'preferredStores': [],
 'ressSize': 'large',
 'stores': [],
 'zipCode': ''}
Any thoughts would be helpful!
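The 'stores' array in your output is empty and 'zipCode' is blank, which suggests the page was requested without a location. Let's use New York as the location: http://www.bestbuy.com/site/store-locator/11356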
For that location you get 25 stores for New York, and you can simply loop over app_data["stores"]. In your Scrapy callback, that means yielding each entry of the list as an item.
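A minimal sketch of such a callback, assuming app_data is built with the same XPath and js2xml.make_dict call as in the question. The helper name iter_stores is hypothetical, and js2xml is imported inside parse only so that the helper can be tried without js2xml installed:

```python
def iter_stores(app_data):
    """Yield each entry of app_data["stores"]; yields nothing if the key is absent."""
    for store in app_data.get("stores", []):
        yield store


def parse(self, response):
    # Intended as a method on a Scrapy spider (hypothetical placement).
    # js2xml is a third-party package, imported here so iter_stores()
    # above can be used on its own.
    import js2xml

    # Same extraction steps as in the question.
    js = response.xpath(
        '//script[contains(., "window.appData")]/text()'
    ).extract_first()
    jstree = js2xml.parse(js)
    app_data_node = jstree.xpath(
        '//assign[left//identifier[@name="appData"]]/right/*'
    )[0]
    app_data = js2xml.make_dict(app_data_node)

    # Pass along only the stores, one dict per store.
    yield from iter_stores(app_data)
```

Each yielded dict should carry the keys shown in the question's JSON ("id", "name", "addr1", and so on), so the rest of appData never reaches your pipeline.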