Description
Yahoo Finance seems to use zero-based month numbers in its historical data URLs (each month parameter is one less than the calendar month), and DataReader doesn't seem to accommodate this.
For example, running:
ticker = '^GSPC'
start_dt = datetime(2008, 6, 30)
end_dt = datetime(2010, 12, 31)
data = DataReader(ticker, 'yahoo', start=start_dt, end=end_dt)
Then:
In[27]: data.index[0]
Out[27]: datetime.datetime(2008, 7, 30, 0, 0)
In[28]: data.index[-1]
Out[28]: datetime.datetime(2011, 11, 2, 0, 0)
This shows that the pull started in July, not June as intended, and ended on the latest available day instead of December 31, 2010.
Looking at the source, DataReader looks like it constructs the following URL for the download:
http://ichart.yahoo.com/table.csv?s=^GSPC&a=6&b=30&c=2008&d=12&e=31&f=2010&g=d&ignore=.csv
If I download the same data from Yahoo manually, the URL is:
http://ichart.finance.yahoo.com/table.csv?s=^GSPC&a=05&b=30&c=2008&d=11&e=31&f=2010&g=d&ignore=.csv
(Whether the host name includes .finance doesn't seem to matter.)
Working around this isn't straightforward: DataReader requires a datetime, and you can't shift the end date back a month yourself because datetime(2010, 11, 31) is invalid (November has only 30 days). I'm assuming Yahoo returns all data up to today because d=12 is not a valid month in the URL (December is 11, January is 0).
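A possible fix would be to subtract one from the month when the URL is built, rather than asking the caller to shift the dates. The sketch below is only an illustration of that idea, not DataReader's actual code: the function name yahoo_history_url and its base parameter are hypothetical, and the zero-padding of the month simply mirrors the manually downloaded URL above. It only builds the string and makes no network request.

```python
from datetime import datetime
from urllib.parse import urlencode


def yahoo_history_url(symbol, start, end,
                      base="http://ichart.yahoo.com/table.csv"):
    """Build a historical-data URL with zero-based months.

    Hypothetical helper for illustration; not part of pandas.
    """
    params = [
        ("s", symbol),
        # Yahoo's month parameters appear to be zero-based: January is 00,
        # December is 11, so subtract one from the calendar month.
        ("a", f"{start.month - 1:02d}"),
        ("b", start.day),
        ("c", start.year),
        ("d", f"{end.month - 1:02d}"),
        ("e", end.day),
        ("f", end.year),
        ("g", "d"),          # daily frequency, as in the URLs above
        ("ignore", ".csv"),
    ]
    # safe="^" keeps index tickers such as ^GSPC unescaped.
    return base + "?" + urlencode(params, safe="^")


url = yahoo_history_url("^GSPC", datetime(2008, 6, 30), datetime(2010, 12, 31))
print(url)
```

With the dates from the example, this yields a=05 and d=11, matching the URL obtained manually, so the caller can keep passing ordinary datetime objects.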