You can choose to center your project around applying reinforcement learning to a real-world problem. This year, we focus on an application in sustainable energy (see below).
RL algorithm:
If you focus on an application, we advise using an established (model-free) RL algorithm from CleanRL, such as PPO (Code).
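For orientation, the core of PPO is its clipped surrogate objective. The snippet below is a minimal NumPy sketch of that loss only, not CleanRL's implementation; the function name `ppo_clip_loss` and its argument names are illustrative assumptions, and a full agent additionally needs a value loss, an entropy bonus, and advantage estimation.

```python
import numpy as np

def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Clipped surrogate policy loss of PPO (to be minimized).

    new_logp, old_logp: log-probabilities of the taken actions under the
    current policy and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for those actions.
    """
    new_logp = np.asarray(new_logp, dtype=float)
    old_logp = np.asarray(old_logp, dtype=float)
    advantages = np.asarray(advantages, dtype=float)

    # Probability ratio between the current and the old policy.
    ratio = np.exp(new_logp - old_logp)

    # Unclipped and clipped surrogate terms.
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages

    # Take the pessimistic (smaller) term, and negate because PPO
    # maximizes the surrogate while optimizers minimize a loss.
    return -np.mean(np.minimum(unclipped, clipped))
```

The clipping removes the incentive to push the probability ratio further than `clip_eps` away from 1 within a single update, which is what keeps policy updates conservative.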
Chargax is an internally developed, high-speed simulator for EV charging. The course teachers can therefore readily provide additional support for this tool.
The challenge is to schedule arriving cars across charging sites in the optimal way. The best strategy depends on complex patterns in weather, car arrival distributions, customers' rest times, battery charge levels, market prices, etc.
Wind and solar energy have varying production profiles. To keep the energy grid stable, the energy produced by renewable sources needs to be absorbed and (later) delivered back by battery storage. This is achieved by placing bids in the electricity market.
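To make the absorb-then-deliver idea concrete, here is a toy sketch of a fixed price-threshold policy for a battery, the kind of simple baseline an RL agent could be compared against. Everything in it is an assumption for illustration: the function `simulate_battery`, the hourly time step, the thresholds, and the battery parameters are hypothetical, losses and degradation are ignored, and every bid is assumed to be accepted at the given price.

```python
def simulate_battery(prices, low, high, capacity_mwh=4.0, power_mw=1.0):
    """Toy threshold policy over hourly prices (all values hypothetical).

    Buys energy (charges) when the price is at or below `low`, sells
    energy (discharges) when the price is at or above `high`.
    Returns (profit, final state of charge in MWh).
    """
    soc = 0.0      # state of charge, starts empty
    profit = 0.0
    for price in prices:
        if price <= low and soc < capacity_mwh:
            # Absorb energy from the grid, limited by power and free capacity.
            energy = min(power_mw, capacity_mwh - soc)
            soc += energy
            profit -= energy * price
        elif price >= high and soc > 0.0:
            # Deliver stored energy back, limited by power and stored energy.
            energy = min(power_mw, soc)
            soc -= energy
            profit += energy * price
    return profit, soc


# Two cheap hours followed by two expensive hours.
profit, soc = simulate_battery([20, 20, 80, 80], low=30, high=70)
print(profit, soc)  # 120.0 0.0
```

A learned policy would replace the fixed thresholds with decisions conditioned on forecasts and market state; the bookkeeping of charge and profit would stay much the same.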