Add more policy gradient algorithms (PPO, RPO, DDPG, TD3, etc.)

To extend this repository's capabilities and make it more useful for benchmarking, it would help to implement additional policy gradient algorithms. Suggested additions:

- PPO (Proximal Policy Optimization)
- RPO (Robust Policy Optimization)
- DDPG (Deep Deterministic Policy Gradient)
- TD3 (Twin Delayed Deep Deterministic Policy Gradient)

These algorithms are widely used in reinforcement learning research and would broaden the range of experiments this codebase supports. Contributions should be modular and maintainable, ideally following the design patterns already present in the repo.
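
To make the modularity request concrete, here is a minimal sketch of how new algorithms could plug into a shared interface. It does not reflect this repo's actual structure: `Algorithm`, `register`, `make`, `ALGORITHMS`, and the `PPO` stub are all hypothetical names chosen for illustration. The only algorithm-specific piece included is PPO's standard clipped surrogate term, `min(r * A, clip(r, 1 - eps, 1 + eps) * A)`, written for a single sample.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, Type

# Hypothetical registry mapping a short name to an algorithm class.
ALGORITHMS: Dict[str, Type["Algorithm"]] = {}


def register(name: str) -> Callable[[Type["Algorithm"]], Type["Algorithm"]]:
    """Class decorator that adds an algorithm to the registry under `name`."""

    def decorator(cls: Type["Algorithm"]) -> Type["Algorithm"]:
        if name in ALGORITHMS:
            raise ValueError(f"algorithm {name!r} is already registered")
        ALGORITHMS[name] = cls
        return cls

    return decorator


class Algorithm(ABC):
    """Hypothetical common interface that each new algorithm would implement."""

    @abstractmethod
    def select_action(self, observation: Any) -> Any:
        """Return an action for the given observation."""

    @abstractmethod
    def update(self, batch: Any) -> Dict[str, float]:
        """Run one learning step and return scalar metrics for logging."""


@register("ppo")
class PPO(Algorithm):
    """Stub only: networks, rollout storage, and the optimizer are omitted."""

    def __init__(self, clip_coef: float = 0.2) -> None:
        self.clip_coef = clip_coef

    def clipped_surrogate(self, ratio: float, advantage: float) -> float:
        """PPO clipped objective for one sample (to be maximized)."""
        low, high = 1.0 - self.clip_coef, 1.0 + self.clip_coef
        clipped_ratio = max(low, min(ratio, high))
        return min(ratio * advantage, clipped_ratio * advantage)

    def select_action(self, observation: Any) -> Any:
        raise NotImplementedError("policy network not included in this sketch")

    def update(self, batch: Any) -> Dict[str, float]:
        raise NotImplementedError("training loop not included in this sketch")


def make(name: str, **kwargs: Any) -> Algorithm:
    """Instantiate a registered algorithm by name."""
    if name not in ALGORITHMS:
        raise KeyError(f"unknown algorithm {name!r}; known: {sorted(ALGORITHMS)}")
    return ALGORITHMS[name](**kwargs)
```

With a layout like this, adding RPO, DDPG, or TD3 would mean adding one registered class per algorithm, and experiment scripts could select one by name, for example `make("ppo", clip_coef=0.2)`. If the repo already has its own base class or configuration pattern, that should take precedence over this sketch.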


Add more policy gradient algorithms (PPO, RPO, DDPG, TD3, etc.) #3
