If you are interested in historical financial data, you often face the task of fetching the data and preparing it for your processing pipeline. You may have a backtesting system or various scanners running, so you don't want to spend a lot of time preprocessing your data to fit your pipeline.
A Small Python Helper
This is a small helper I use fairly often, for example to download historical Bitcoin prices with Python.
All you need are the libraries pandas and yfinance, which you can install into your virtual environment with the command pip install yfinance pandas.
I wrote a small helper that converts the data from Yahoo Finance (which is already delivered as a pandas.DataFrame) to my needs. Basically, it just sets a column as DateTime and, if needed, renames the columns to fit my data pipeline.
You can pass different tickers like BTC-EUR, BTC-USD, ETH-EUR etc. and get (depending on the asset) data in various resolutions, e.g. 1m, 2m, 5m, 15m, 30m, 60m, 90m, 1h, 1d, 5d, 1wk, 1mo, 3mo. For the big cryptocurrencies, historical data is definitely available down to 60m, which is the resolution I mostly use in my processes.
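Since the available resolution depends on the asset, it can help to check the interval string before requesting data and to derive the output file name from ticker and interval. The following is only a minimal sketch: VALID_INTERVALS and build_csv_path are hypothetical names that are not part of yfinance, and the interval list is simply the one quoted above rather than something queried from the library.

```python
from pathlib import Path

# Interval strings quoted in the text above; not queried from yfinance.
VALID_INTERVALS = ("1m", "2m", "5m", "15m", "30m", "60m", "90m",
                   "1h", "1d", "5d", "1wk", "1mo", "3mo")


def build_csv_path(ticker: str, interval: str, folder: str = "data") -> Path:
    """Hypothetical helper: build a file path such as data/btc-eur_60m.csv."""
    if interval not in VALID_INTERVALS:
        raise ValueError(f"Unsupported interval: {interval!r}")
    return Path(folder) / f"{ticker.lower()}_{interval}.csv"


# Print the file name each ticker would be saved under at 60m resolution.
for ticker in ("BTC-EUR", "BTC-USD", "ETH-EUR"):
    print(build_csv_path(ticker, "60m").as_posix())
```

Whether a given ticker really offers data at a given interval still has to be checked against the download itself; the sketch only guards against typos in the interval string.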
No git repo is needed for this simple script. Have a look and adapt it to your needs!
Example Code
import pandas as pd
import yfinance as yf


def convert_yf_data(df: pd.DataFrame) -> pd.DataFrame:
    """
    Converts a Yahoo Finance OHLC DataFrame to the column name(s) used in this project.

    old_names = ['Date', 'Open', 'High', 'Low', 'Close', 'Adj Close', 'Volume']
    new_names = ['Date', 'Open', 'High', 'Low', 'Close']

    :param df: DataFrame as returned by yf.download, indexed by date
    :return: DataFrame with a datetime 'Date' column and the OHLC columns
    """
    df_output = pd.DataFrame()
    # The dates are the index of the downloaded frame; copy them into a regular column
    df_output['Date'] = list(df.index)
    df_output['Date'] = pd.to_datetime(df_output['Date'], format="%Y-%m-%d %H:%M:%S")
    # Copy the price columns by position so the old index is dropped
    df_output['Open'] = df['Open'].to_list()
    df_output['High'] = df['High'].to_list()
    df_output['Low'] = df['Low'].to_list()
    df_output['Close'] = df['Close'].to_list()
    return df_output


df = yf.download(tickers='ETH-USD',
                 interval="1d",
                 start="2021-02-28")
# Forward slash avoids the invalid "\e" escape sequence; the data folder must already exist
convert_yf_data(df).to_csv('data/eth-usd_1d.csv', sep=",", index=False)
That's it. It will save a .csv file containing the data to the given folder and filename.
Date,Open,High,Low,Close
2021-02-28,1459.8604736328125,1468.3914794921875,1300.47216796875,1416.0489501953125
2021-03-01,1417.151123046875,1567.694580078125,1416.4161376953125,1564.7076416015625
2021-03-02,1564.0634765625,1597.610107421875,1461.325439453125,1492.6087646484375
2021-03-03,1491.451171875,1650.360595703125,1481.90576171875,1575.8531494140625
2021-03-04,1574.623779296875,1622.953857421875,1511.1033935546875,1541.914306640625
2021-03-05,1541.541748046875,1547.878173828125,1450.891357421875,1533.2750244140625
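To feed the saved file back into a pipeline, it can be read with pandas again. The snippet below is a small sketch of that step; it inlines the first rows of the output shown above via io.StringIO so that it runs without a file on disk, and the variable names csv_text and df_loaded are chosen for illustration only.

```python
import io

import pandas as pd

# First rows of the output shown above, inlined so the sketch runs without a file.
csv_text = """Date,Open,High,Low,Close
2021-02-28,1459.8604736328125,1468.3914794921875,1300.47216796875,1416.0489501953125
2021-03-01,1417.151123046875,1567.694580078125,1416.4161376953125,1564.7076416015625
2021-03-02,1564.0634765625,1597.610107421875,1461.325439453125,1492.6087646484375
"""

# parse_dates converts the Date column to datetime values instead of plain strings.
df_loaded = pd.read_csv(io.StringIO(csv_text), sep=",", parse_dates=["Date"])

print(df_loaded.dtypes)
print(df_loaded.head())
```

With a real file, io.StringIO(csv_text) would be replaced by the path that was passed to to_csv.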