When building a large table of 1s-timeframe candles (generally more than 10000 rows), I found the data shifted because of a 5-minute gap (5m skipped) between rows 999-1000, 1999-2000, 2999-3000, and so on.
This also occurs with the 1m timeframe (I suspect it also occurs with 1h, but there are not enough historical candles to test it).
Part of the result I got is shown here (1s TF):
.
.
.
995 2020-06-05 21:46:35+07:00 9705.19 9706.02 9705.19 9706.02
996 2020-06-05 21:46:36+07:00 9706.02 9706.02 9706.02 9706.02
997 2020-06-05 21:46:37+07:00 9705.77 9706.02 9705.77 9706.02
998 2020-06-05 21:46:38+07:00 9706.02 9706.72 9706.02 9706.72
999 2020-06-05 21:46:39+07:00 9706.72 9706.72 9706.72 9706.72 **21:46:39**
1000 2020-06-05 21:51:39+07:00 9698.76 9698.76 9698.76 9698.76 **21:51:39**(5m skipped)
1001 2020-06-05 21:51:40+07:00 9698.76 9698.76 9698.76 9698.76
1002 2020-06-05 21:51:41+07:00 9698.76 9698.76 9698.76 9698.76
1003 2020-06-05 21:51:42+07:00 9698.76 9698.76 9698.76 9698.76
1004 2020-06-05 21:51:43+07:00 9698.87 9698.88 9698.87 9698.88
1005 2020-06-05 21:51:44+07:00 9698.88 9698.88 9698.88 9698.88
.
.
.
1995 2020-06-05 22:08:14+07:00 9684.71 9684.71 9684.71 9684.71
1996 2020-06-05 22:08:15+07:00 9684.71 9684.71 9684.71 9684.71
1997 2020-06-05 22:08:16+07:00 9684.71 9684.71 9684.71 9684.71
1998 2020-06-05 22:08:17+07:00 9684.71 9684.71 9684.71 9684.71
1999 2020-06-05 22:08:18+07:00 9684.71 9684.71 9684.71 9684.71 **22:08:18**
2000 2020-06-05 22:13:18+07:00 9677.95 9677.95 9677.95 9677.95 **22:13:18**(5m skipped)
2001 2020-06-05 22:13:19+07:00 9677.95 9677.95 9677.95 9677.95
2002 2020-06-05 22:13:20+07:00 9677.66 9679.82 9677.66 9679.82
2003 2020-06-05 22:13:21+07:00 9679.82 9679.82 9679.82 9679.82
2004 2020-06-05 22:13:22+07:00 9679.82 9679.82 9679.82 9679.82
2005 2020-06-05 22:13:23+07:00 9679.82 9679.82 9679.82 9679.82
.
.
.
And for the 1m TF:
.
.
.
995 2020-06-06 14:05:00+07:00 9612.17 9617.92 9612.00 9617.41
996 2020-06-06 14:06:00+07:00 9617.75 9621.15 9615.25 9618.87
997 2020-06-06 14:07:00+07:00 9618.95 9618.96 9618.32 9618.50
998 2020-06-06 14:08:00+07:00 9618.36 9619.00 9617.04 9618.60
999 2020-06-06 14:09:00+07:00 9618.61 9624.30 9618.61 9624.30 **14:09:00**
1000 2020-06-06 14:14:00+07:00 9620.23 9620.48 9619.27 9620.05 **14:14:00**(5m skipped)
1001 2020-06-06 14:15:00+07:00 9619.72 9623.24 9615.46 9615.46
1002 2020-06-06 14:16:00+07:00 9615.41 9615.69 9613.98 9613.98
1003 2020-06-06 14:17:00+07:00 9613.50 9613.63 9609.43 9610.10
1004 2020-06-06 14:18:00+07:00 9610.10 9616.13 9610.10 9615.65
1005 2020-06-06 14:19:00+07:00 9615.91 9615.91 9612.09 9613.11
.
.
.
Has anyone encountered this issue before? Did I do something wrong in the script?
def dataframe_details_func(df_ohlcv, TIMEFRAME, LIMIT):
    # Keep requesting chunks until enough candles have been collected
    while len(df_ohlcv) < LIMIT:
        from_ts = df_ohlcv[-1][0] + 300000
        new_ohlcv = exchange.fetch_ohlcv(PAIR, timeframe=TIMEFRAME, since=from_ts, limit=LIMIT)
        df_ohlcv.extend(new_ohlcv)
    df_ohlcv = pd.DataFrame(df_ohlcv, columns=['datetime', 'open', 'high', 'low', 'close', 'volume'])
    # Timestamps are in milliseconds (UTC); convert them to Bangkok local time
    df_ohlcv['datetime'] = pd.to_datetime(df_ohlcv['datetime'], unit='ms')
    df_ohlcv.datetime = df_ohlcv.datetime.dt.tz_localize('UTC').dt.tz_convert('Asia/Bangkok')
    return df_ohlcv

df_ohlcv1S = dataframe_details_func(df_ohlcv1, TIMEFRAME1S, LIMIT1S)
pd.set_option('display.max_rows', None, 'display.max_columns', None)
print(df_ohlcv1S.loc[900:1200, ['datetime', 'open', 'high', 'low', 'close']])
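The problem is this line:
    from_ts = df_ohlcv[-1][0] + 300000
That statement is literally saying "start this chunk 5 minutes after the end of the last chunk". You don't want the 300000 delta here. The timestamps are in milliseconds, so use the length of one candle instead: 1000 to start at the next second for the 1s timeframe, or 60000 for the 1m timeframe.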
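
As a rough illustration of that fix, here is a minimal sketch of a paging loop that advances by exactly one candle per chunk. The helper names `timeframe_to_ms` and `fetch_all` are hypothetical, and `fake_fetch` is only a stand-in for the exchange call so the snippet runs without a network connection; it assumes each row starts with a millisecond timestamp, as in the script above.

```python
# Milliseconds per unit for timeframe strings such as '1s', '1m', '1h'.
_UNIT_MS = {'s': 1_000, 'm': 60_000, 'h': 3_600_000, 'd': 86_400_000}


def timeframe_to_ms(timeframe):
    """Convert a timeframe string like '5m' to its length in milliseconds."""
    return int(timeframe[:-1]) * _UNIT_MS[timeframe[-1]]


def fetch_all(fetch_page, timeframe, since, total, page_limit=1000):
    """Collect up to `total` candles; fetch_page(since, limit) returns a list of rows."""
    step = timeframe_to_ms(timeframe)
    candles = []
    while len(candles) < total:
        page = fetch_page(since, min(page_limit, total - len(candles)))
        if not page:
            break  # nothing more to fetch
        candles.extend(page)
        # Start the next chunk exactly one candle after the last one received.
        since = page[-1][0] + step
    return candles


def fake_fetch(since, limit):
    """Stand-in for the exchange: one synthetic 1s candle per second."""
    return [[since + i * 1000, 1.0, 1.0, 1.0, 1.0, 0.0] for i in range(limit)]


rows = fetch_all(fake_fetch, '1s', since=0, total=2500)
# Every pair of consecutive rows should be exactly one second apart.
gaps = {b[0] - a[0] for a, b in zip(rows, rows[1:])}
print(len(rows), gaps)
```

With the real exchange, `fetch_page` could be something like `lambda since, limit: exchange.fetch_ohlcv(PAIR, timeframe=TIMEFRAME, since=since, limit=limit)`. Computing the set of timestamp differences, as done for `gaps` here, is also a quick way to confirm that no chunk boundary skipped any candles.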