Yahoo Finance Download Data

I am trying to scrape finance.yahoo.com and download a data file. Specifically, this url: https://finance.yahoo.com/quote/AAPL/history?p=AAPL

I would like to accomplish two things here: 1) set the time period to "Max", which I believe requires Selenium, and 2) download and save the data file behind the href that appears when I inspect "Download Data".

So far, I am unable to access the drop-down required to click "Max" and also cannot locate the href required to download the file.

Any help would be greatly appreciated.

from selenium import webdriver
import time
from selenium.webdriver.chrome.options import Options

options = webdriver.ChromeOptions()
options.add_argument('--log-level=3')

stock = input()
base_url = 'https://finance.yahoo.com/quote/{}/history?p={}'.format(stock, stock)
driver = webdriver.Chrome()
driver.get(base_url)
driver.maximize_window()
driver.implicitly_wait(4)
driver.find_element_by_class_name("Fl(end) Mt(3px) Cur(p)").click()
time.sleep(4)
driver.quit()

2 Answers

0
QHarr On Best Solutions

The following shows selectors you can use. I haven't added any wait conditions because the one that is really needed, a wait for all the new data to be present after pressing the Apply button, is one I couldn't find in my test runs. Instead, I use a hard-coded time.sleep(5), which should be replaced with a condition-based wait if possible.

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By
# from selenium.webdriver.support.ui import WebDriverWait
# from selenium.webdriver.support import expected_conditions as EC
import time

d = webdriver.Chrome()
d.get('https://finance.yahoo.com/quote/AAPL/history?p=AAPL')
try:
    d.find_element(By.CSS_SELECTOR, '[name=agree]').click()  # oauth
except NoSuchElementException:
    pass  # prompt not shown, carry on

# Raw strings keep the backslash escapes in the CSS selectors intact.
d.find_element(By.CSS_SELECTOR, '[data-icon=CoreArrowDown]').click()  # dropdown
d.find_element(By.CSS_SELECTOR, '[data-value=MAX]').click()  # max
d.find_element(By.CSS_SELECTOR, r'button.Fl\(start\)').click()  # done
d.find_element(By.CSS_SELECTOR, r'button.Fl\(end\) span').click()  # apply
time.sleep(5)  # crude wait for the new data to load
d.find_element(By.CSS_SELECTOR, '[download]').click()  # download
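
As a rough sketch of what a condition-based wait could look like in place of time.sleep(5): the helper below polls a callable until it returns something truthy or a timeout expires. This is a hand-rolled stand-in for the idea behind Selenium's WebDriverWait, not Selenium's own API; the name wait_until and its parameters are made up for illustration, and which condition actually signals that the new data has loaded is still something you would have to work out on the page.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Poll condition() until it returns a truthy value or timeout seconds pass.

    Returns the truthy value, or raises TimeoutError if the deadline is reached.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "condition not met within {} seconds".format(timeout)
            )
        time.sleep(poll)  # back off briefly before checking again
```

With the driver d from above, a call might look like wait_until(lambda: d.find_elements(By.CSS_SELECTOR, '[download]')), which only checks that the download link is present. A check on the table contents would probably be closer to what is needed, but I don't have a reliable selector for that.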
0
Crayons On

You can eliminate #1 right off the bat -- just view the page directly, passing the parameters as requested.

The base URI is: https://finance.yahoo.com/quote/AAPL/history

The available parameters are: period1, period2, interval, filter and frequency.

It's pretty simple: take the current time as an epoch timestamp and use it as the period2 parameter; period1 can simply be the beginning of the epoch, 0. The interval and frequency can be whatever you want: daily (1d), weekly (1wk) or monthly (1mo). Lastly, the filter should be history.

The completed URI:

https://finance.yahoo.com/quote/AAPL/history?period1=0&period2=1555905600&interval=1d&filter=history&frequency=1d
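
If you'd rather build that URI in code than by hand, something like the following should do it. It is a minimal sketch using only the standard library; the function name build_history_url and its arguments are my own invention, and it assumes the page keeps accepting the parameters listed above.

```python
import time
from urllib.parse import urlencode

BASE_URI = "https://finance.yahoo.com/quote/{symbol}/history"


def build_history_url(symbol, interval="1d", period2=None):
    """Build the history-page URL covering epoch 0 up to period2.

    period2 defaults to the current time as an epoch timestamp.
    interval is one of "1d", "1wk" or "1mo" and is reused for frequency.
    """
    if period2 is None:
        period2 = int(time.time())  # "now" in seconds since the epoch
    params = {
        "period1": 0,  # beginning of the epoch
        "period2": period2,
        "interval": interval,
        "filter": "history",
        "frequency": interval,
    }
    return BASE_URI.format(symbol=symbol) + "?" + urlencode(params)


if __name__ == "__main__":
    # Reproduces the completed URI shown above.
    print(build_history_url("AAPL", period2=1555905600))
```

Leaving out period2 uses the current time, so build_history_url("AAPL", interval="1wk") would request weekly data up to now.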

From there, use Selenium to locate and click the Download Data link.

I've also downvoted you because you clearly put in absolutely no effort of any kind to figure this out for yourself.

UPDATE:

As @QHarr also said, there are numerous questions all over Stack Overflow detailing how to work with Yahoo Finance. I also recommend you give searching a whirl.