I am trying to download a file off of a web site using the httr package in R. This is the web site:
There is a "Download Holdings (XLS)" button under the Portfolio Holdings section that downloads the file in your web browser.
Using the developer mode on my browser I found that the data is retrieved via a POST command. I copied all of the headers and payload information exactly as my browser has them and created this POST command using the httr package:
dta <- POST("https://www.franklintempleton.com/api/pds/price-and-performance?apikey=4ef35821-5244-41bc-a699-0192d002c3d1p&op=Holdings&id=14",
body='{"operationName":"Holdings","variables":{"countrycode":"US","languagecode":"en_US","fundid":"91616"},"query":"query Holdings($fundid: String!, $countrycode: String!, $languagecode: String!) {\n Portfolio(\n fundid: $fundid\n countrycode: $countrycode\n languagecode: $languagecode\n ) {\n fundname\n producttype\n assetclass\n portfolio {\n topholdings {\n asofdate\n hldngname\n geocode\n sectorname\n brkdwnpct\n frequency\n calcbasislocal\n calctypelocal\n allocflag\n hasderivatives\n }\n dailyholdings {\n asofdate\n frequency\n secticker\n isinsecnbr\n cusipnbr\n secname\n quantityshrpar\n origcouponrate\n sctrname\n pctofnetassets\n mktvalue\n notionalmktvalue\n secexpdate\n assetclasscatg\n mktcurr\n contracts\n }\n fullholdings {\n asofdate\n asofdatestd\n frequency\n secticker\n isinsecnbr\n cusipnbr\n secname\n quantityshrpar\n origcouponrate\n sctrname\n pctofnetassets\n mktvalue\n notionalmktvalue\n secexpdate\n assetclasscatg\n mktcurr\n contracts\n finalmaturitydate\n investmentcategory\n }\n }\n }\n}\n"}',
add_headers(.headers=c(
'Host'='www.franklintempleton.com',
'User-Agent'='Mozilla/5.0 (Windows NT 10.0; rv:121.0) Gecko/20100101 Firefox/121.0',
'Accept'='application/json',
'Accept-Language'='en-US,en;q=0.5',
'Accept-Encoding'='gzip, deflate, br',
'Content-Type'='application/json',
'Content-Length'=1401,
'Origin'='https://www.franklintempleton.com',
'DNT'=1,
'Connection'='keep-alive',
'Referer'='https://www.franklintempleton.com/investments/options/exchange-traded-funds/products/91616/SINGLCLASS/clearbridge-all-cap-growth-esg-etf/CACG',
'Sec-Fetch-Dest'='empty',
'Sec-Fetch-Mode'='cors',
'Sec-Fetch-Site'='same-origin',
'Sec-GPC'=1,
'Pragma'='no-cache',
'Cache-Control'='no-cache',
'TE'='trailers')))
When I run that command it times out. I think everything with the command is ok because I've done something similar on other web sites in the past. The web site has no problem loading in my browser on the same machine, ip, etc. Even with the same headers, can the web server still detect that this is not a browser and then decide to ignore request? Or is there another reason why this isn't working? I would rather not use something like RSelenium to do this if possible. Is there a way to make this command work using the httr package or something similar? Thank you in advance
You can get a curl command from your browser's dev. tools network tab context menu (Copy > Copy as cURL (bash) in Chrome), which you can feed into
httr2::curl_translate(), https://curlconverter.com/r/ or some similar tool to translate it tohttr/httr2. The result is not always perfect, but often good enough. Trimmed-down curlconverter.com/r/ result works as expected:Created on 2024-03-21 with reprex v2.1.0
And just in case, there is no XLSX download, Excel file is constructed from JSON data locally by your browser.