Plotting Yahoo's share price with pandas_datareader, but the prices look wrong


I'm learning with an O'Reilly course, Introduction to Pandas for Developers.

I plotted a chart of the Yahoo stock price. I had to modify the code from the course because it is out of date.

Here is the jupyter notebook: https://nbviewer.jupyter.org/github/jeremy886/pydata-notes/blob/master/ipython-notebooks/Time_Series.ipynb

Please skip to the bottom for the chart.

I compared my chart with the author's and with the historical prices from Google, and mine differs from both. (I think the author's differs from Google's too.)

Price Info from Google: https://www.google.com.tw/search?q=yahoo+price&ie=utf-8&oe=utf-8&client=firefox-b&gfe_rd=cr&ei=vHJkWNPuKvOm8weq5qrQBw

At a glance, the data returned by pandas_datareader does not seem correct. For example, most of the Close prices I got are about $10: yesterday's was about $12, but the price from Google is about $38.

What is the problem?

  • Is pandas_datareader no longer trustworthy?
  • Is there some kind of adjustment I don't understand?
  • Is there a bug in my code or the author's?

Thanks and Happy New Year.

Answer by Andrew Guy (best answer)

You are calling data.DataReader('F', 'yahoo', start, end).

From the source:

def DataReader(name, data_source=None, start=None, end=None,
               retry_count=3, pause=0.001, session=None, access_key=None):
    """
    Imports data from a number of online sources.
    Currently supports Yahoo! Finance, Google Finance, St. Louis FED (FRED),
    Kenneth French's data library, and the SEC's EDGAR Index.
    Parameters
    ----------
    name : str or list of strs
        the name of the dataset. Some data sources (yahoo, google, fred) will
        accept a list of names.
    data_source: {str, None}
        the data source ("yahoo", "yahoo-actions", "yahoo-dividends",
        "google", "fred", "ff", or "edgar-index")

The first parameter is the name of the dataset you are interested in. In your case that is 'F', the ticker symbol for Ford.

The data_source parameter is the site the data is fetched from, in your case 'yahoo'. It does not select which company's stock you get, so 'yahoo' here does not mean Yahoo's share price. If you look up Ford's stock price, you will see it compares well with your chart.
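To make the distinction concrete, here is a minimal sketch of a wrapper that keeps the two arguments apart. The name fetch_prices, its reader parameter, and the dates in the usage comments are hypothetical and only for illustration. The ticker 'YHOO' assumes Yahoo's symbol at the time of the question. The call to DataReader passes its arguments in the order shown in the source excerpt above.

```python
import datetime as dt


def fetch_prices(ticker, start, end, source="yahoo", reader=None):
    """Fetch price history for `ticker` from the site named by `source`.

    `ticker` selects the company; `source` only selects the website
    that serves the data. (Hypothetical helper, not part of the library.)
    """
    if reader is None:
        # Imported lazily so the sketch can be read and tested offline.
        from pandas_datareader import data
        reader = data.DataReader
    # Same positional order as DataReader(name, data_source, start, end).
    return reader(ticker, source, start, end)


# Usage (needs network access and a working Yahoo endpoint;
# the dates are made up for illustration):
# start, end = dt.datetime(2016, 1, 1), dt.datetime(2016, 12, 28)
# ford = fetch_prices("F", start, end)       # Ford's prices, served by Yahoo
# yahoo = fetch_prices("YHOO", start, end)   # Yahoo's own prices (assumed ticker)
```

With this split, changing the company means changing only the first argument, while 'yahoo' stays the same for both calls.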

When in doubt, read the docs. If the docs don't help, read the source - https://github.com/pydata/pandas-datareader