Loading...
Playground
Q-learning discovers when to push and when to coast—just like pumping a swing!
Rope, friction, and wind change the challenge
Reward is swing height (1−cos θ). The agent learns left/right pushes to build amplitude.
Purple robot = agent · bar = height
Steps this episode
0.0000
Episode return
0.0000
Height
0.0036
High swings
0.0000
Higher swings raise the curve