Add more policy gradient algorithms (PPO, RPO, DDPG, TD3, etc.) #3

@jrcalgo

Description

To broaden this repository's capabilities and make it more useful for benchmarking, it would help to implement additional policy gradient algorithms. Suggested additions:

  • PPO (Proximal Policy Optimization)
  • RPO (Robust Policy Optimization)
  • DDPG (Deep Deterministic Policy Gradient)
  • TD3 (Twin Delayed Deep Deterministic Policy Gradient)
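To make the scope concrete, here is a rough sketch of two of the core computations involved: the clipped surrogate objective used by PPO and the clipped double-Q bootstrap target used by TD3. It is plain Python over lists and scalars rather than tensors, and the function names and default hyperparameters (`clip_eps=0.2`, `gamma=0.99`) are illustrative assumptions, not anything taken from this repo. TD3's target policy smoothing and delayed actor updates are not shown.

```python
import math


def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    logp_new / logp_old: per-sample log-probabilities of the taken actions
    under the current and the data-collecting policy.
    """
    total = 0.0
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """TD3 critic target: r + gamma * (1 - done) * min(Q1', Q2').

    q1_next / q2_next are the two target critics' values at the next state;
    done is 1.0 for terminal transitions and 0.0 otherwise.
    """
    return reward + gamma * (1.0 - done) * min(q1_next, q2_next)
```

DDPG's critic target has the same form but bootstraps from a single target critic instead of the minimum of two.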

These algorithms are widely used in reinforcement learning research and would broaden the range of experiments possible with this codebase. Contributions should be modular and maintainable, ideally following the design patterns already present in the repo.
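One possible way to keep new algorithms modular is a shared base class plus a small name-based registry, sketched below. Everything here (`PolicyGradientAgent`, `register`, `make_agent`, `PPOAgent`) is hypothetical and only meant to illustrate the idea; the repo's existing patterns should take precedence where they differ.

```python
from abc import ABC, abstractmethod

# Hypothetical registry mapping algorithm names to agent classes.
_REGISTRY = {}


def register(name):
    """Class decorator that records an agent class under a lowercase name."""
    def decorator(cls):
        _REGISTRY[name.lower()] = cls
        return cls
    return decorator


class PolicyGradientAgent(ABC):
    """Hypothetical common interface each new algorithm would implement."""

    @abstractmethod
    def act(self, observation):
        """Return an action for the given observation."""

    @abstractmethod
    def update(self, batch):
        """Run one learning step on a batch of transitions."""


def make_agent(name, **kwargs):
    """Instantiate a registered agent by name (case-insensitive)."""
    try:
        cls = _REGISTRY[name.lower()]
    except KeyError:
        raise ValueError(f"unknown algorithm: {name!r}") from None
    return cls(**kwargs)


@register("ppo")
class PPOAgent(PolicyGradientAgent):
    """Placeholder showing where a PPO implementation would plug in."""

    def __init__(self, clip_eps=0.2):
        self.clip_eps = clip_eps

    def act(self, observation):
        raise NotImplementedError

    def update(self, batch):
        raise NotImplementedError
```

With something like this, a benchmark script could select an algorithm with `make_agent("ppo", clip_eps=0.1)`, and adding DDPG or TD3 would only require a new registered class.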
