So how does the RL agent learn to trade stocks?
As discussed above, you don’t have to provide labels at each time step to the RL algorithm. The algorithm initially learns to trade through trial and error, receiving a reward when a trade is closed, and later optimizes its strategy to maximize those rewards. This is different from traditional ML algorithms, which require labels at each time step or at a certain frequency.
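To make the "reward when the trade is closed" idea concrete, here is a minimal sketch of the reward signal only, not of the learning itself. The action encoding, the `run_episode` function, the `buy_then_sell` policy, and the price list are all hypothetical names and made-up values chosen for illustration; they are not taken from our actual system.

```python
from typing import Callable, List

# Hypothetical action encoding, assumed for this sketch only.
HOLD, BUY, SELL = 0, 1, 2


def run_episode(prices: List[float], policy: Callable[[int, bool], int]) -> List[float]:
    """Step through prices; the reward is zero except on the step a trade closes."""
    rewards = []
    entry_price = None  # None means there is no open trade
    for t, price in enumerate(prices):
        action = policy(t, entry_price is not None)
        reward = 0.0
        if action == BUY and entry_price is None:
            entry_price = price  # trade opened: no reward yet
        elif action == SELL and entry_price is not None:
            reward = price - entry_price  # reward arrives only at close
            entry_price = None
        rewards.append(reward)
    return rewards


def buy_then_sell(t: int, in_position: bool) -> int:
    """Toy fixed policy: buy at step 0, sell at step 3, otherwise hold."""
    if t == 0:
        return BUY
    if t == 3 and in_position:
        return SELL
    return HOLD


prices = [10.0, 11.0, 9.0, 12.0, 12.5]  # made-up prices
rewards = run_episode(prices, buy_then_sell)
print(rewards)  # [0.0, 0.0, 0.0, 2.0, 0.0]
```

Notice that no step carries a label saying "this was the right action". The only feedback is the single non-zero reward at step 3, and an RL algorithm has to work out from that which earlier decisions led to it.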
Specifically, our system uses an AI technique known as reinforcement learning, combined with genetic algorithms, to train an agent to identify potential entry and exit points that could return significant profits. Reinforcement learning involves training an AI agent through trial-and-error actions to achieve a particular goal, be that winning a game of chess or, in our case, optimizing investment returns. By receiving rewards when it performs the best action(s) in response to specific situations, the agent learns which actions it should perform. The system also relies on genetic algorithms for fine-tuning on each stock, because most stocks move and trade differently.
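As a rough illustration of how a genetic algorithm can tune settings for one stock, the sketch below evolves two thresholds of a toy trading rule against a single price series. Everything here is an assumption made for the example: the text above does not say what exactly gets tuned, so the `entry` and `exit_` thresholds, the `fitness` and `evolve` functions, the population settings, and the prices are hypothetical stand-ins rather than our production method.

```python
import random
from typing import List, Tuple

Params = Tuple[float, float]  # (entry threshold, exit threshold), hypothetical


def fitness(params: Params, prices: List[float]) -> float:
    """Profit of a toy rule: buy after a drop of at least `entry`,
    sell after a rise of at least `exit_` (both as fractional changes)."""
    entry, exit_ = params
    profit, entry_price = 0.0, None
    for prev, cur in zip(prices, prices[1:]):
        change = (cur - prev) / prev
        if entry_price is None and change <= -entry:
            entry_price = cur
        elif entry_price is not None and change >= exit_:
            profit += cur - entry_price
            entry_price = None
    return profit


def evolve(prices: List[float], generations: int = 20,
           pop_size: int = 20, seed: int = 0) -> Tuple[Params, List[float]]:
    """Return the best parameters found and the best fitness per generation."""
    rng = random.Random(seed)
    pop = [(rng.uniform(0.0, 0.1), rng.uniform(0.0, 0.1)) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda p: fitness(p, prices), reverse=True)
        history.append(fitness(pop[0], prices))
        parents = pop[: pop_size // 2]  # selection: the top half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = (a[0], b[1])  # crossover: one gene from each parent
            # mutation: small Gaussian noise, clipped at zero
            child = tuple(max(0.0, g + rng.gauss(0.0, 0.01)) for g in child)
            children.append(child)
        pop = parents + children
    best = max(pop, key=lambda p: fitness(p, prices))
    return best, history


# Made-up price series standing in for one stock.
prices = [100.0, 97.0, 99.0, 103.0, 101.0, 98.0, 102.0, 105.0, 104.0, 100.0, 103.0]
best, history = evolve(prices)
print(best, fitness(best, prices))
```

Running `evolve` separately on each stock's price history would give each stock its own thresholds, which is the point of per-stock tuning. Because the top half of every generation survives unchanged, the best fitness in `history` never gets worse from one generation to the next.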