I trained a model using gym, gym-anytrading, stable-baselines3 (with sb3-contrib), and yfinance, using the code below.
import gymnasium as gym
import gym_anytrading
import yfinance as yf
from stable_baselines3.common.callbacks import EvalCallback, StopTrainingOnNoModelImprovement
from sb3_contrib import RecurrentPPO
from stable_baselines3.common.monitor import Monitor
from pathlib import Path
output_location = Path("models")
data = yf.download('GOOG', start='2022-01-01', end='2023-01-01')
env = gym.make('stocks-v0', df=data, frame_bound=(5, 100), window_size=5)
model = RecurrentPPO('MlpLstmPolicy', env, verbose=1)
best_model_callback = StopTrainingOnNoModelImprovement(max_no_improvement_evals=5, min_evals=10, verbose=1)
eval_env = gym.make('stocks-v0', window_size=5)
eval_env = Monitor(eval_env, output_location.as_posix())
callback = EvalCallback(eval_env, eval_freq=1000, callback_after_eval=best_model_callback, verbose=1)
model.learn(total_timesteps=10000, callback=callback)
model.save("model")
Then I try to run inference with it using the code below.
data = yf.download('GOOG', start='2023-01-01', end='2024-01-01')
env = gym.make('stocks-v0', df=data, frame_bound=(5,100), window_size=5)
done = False
observation, info = env.reset()
while not done:
    action, states = model.predict(observation)
    observation, rewards, done, truncated, info = env.step(action)
However, that gives me this error:
ValueError: Error: Unexpected observation shape (4, 2) for Box environment, please use (5, 2) or (n_env, 5, 2) for the observation shape.
So I started zero-padding any observation that was missing rows, using the code below.
from numpy import concatenate, full
data = yf.download('GOOG', start='2023-01-01', end='2024-01-01')
env = gym.make('stocks-v0', df=data, frame_bound=(5,100), window_size=5)
done = False
observation, info = env.reset()
while not done:
    if observation.shape != env.observation_space.shape:
        padding = env.observation_space.shape[0] - observation.shape[0]
        observation = concatenate([observation, full((padding, observation.shape[1]), 0)], axis=0)
    action, states = model.predict(observation)
    observation, rewards, done, truncated, info = env.step(action)
That gives me this error:
IndexError: index 106 is out of bounds for axis 0 with size 100
Furthermore, the out-of-bounds index (106 here) changes from run to run, seemingly at random.
I am using Python 3.11 with the following package versions:
stable-baselines3==2.2.1
gym==0.26.2
gym-anytrading==2.0.0
yfinance==0.2.35
sb3-contrib==2.2.1
I've been digging through all kinds of tutorials, articles, documentation, and guides but have not found a working solution yet. Does anyone know why this is happening and how to fix it? Any help is highly appreciated.
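One possible lead, offered as a sketch rather than a confirmed diagnosis: in the Gymnasium API, `step` returns `(observation, reward, terminated, truncated, info)`, and an episode is over when either flag is true. The inference loops above only check the third value. If gym-anytrading signals the end of the data through `truncated` (an assumption worth verifying against the installed version), the loop would keep stepping past the end of the frame, which would be consistent with both the shortened observation and the IndexError. The example below uses a hypothetical stand-in class, `FiniteDataEnv`, instead of the real environment, to show a loop that stops on either flag.

```python
# Hypothetical stand-in for a Gymnasium-style environment; this is NOT
# gym-anytrading, only an illustration of the five-value step() return.
class FiniteDataEnv:
    """Walks over a fixed number of ticks and reports the end via `truncated`."""

    def __init__(self, n_ticks):
        self.n_ticks = n_ticks
        self.tick = 0

    def reset(self):
        self.tick = 0
        return self.tick, {}

    def step(self, action):
        self.tick += 1
        terminated = False                      # assumption: never set by this env
        truncated = self.tick >= self.n_ticks   # end of data is reported here
        reward = 0.0
        return self.tick, reward, terminated, truncated, {}


def rollout(env, policy):
    """Run one episode, stopping as soon as either end-of-episode flag is set."""
    observation, info = env.reset()
    steps = 0
    done = False
    while not done:
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        done = terminated or truncated  # checking `terminated` alone would loop forever here
        steps += 1
    return steps


# A constant policy is enough to exercise the loop.
steps_taken = rollout(FiniteDataEnv(n_ticks=95), policy=lambda obs: 0)
```

With the real environment, the equivalent change would be to unpack `terminated` and `truncated` from `env.step(action)` and set `done = terminated or truncated` inside the loop.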