Everyone's AI

Playground

Hands-on AI lab

Swing RL

Q-learning discovers when to push and when to coast—just like pumping a swing!

Episode: 0
Training settings

Push on the way down, coast on the way up—the Q-table learns this timing from rewards.

  • Learning rate α: How much each Q update moves. Large values learn fast but can oscillate.
  • Discount γ: How much future rewards matter now. Closer to 1 weights distant rewards more.
  • Exploration ε: Random push/coast chance. High ε tries many rhythms; low ε sticks to what worked.
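The three settings above map onto the standard tabular Q-learning rule. The following is a minimal sketch, not the playground's actual code: the three-action encoding (push left, coast, push right), the list-of-lists Q-table, and the function names are all assumptions made for illustration.

```python
import random

# Assumed action encoding (hypothetical): index 0 = push left, 1 = coast, 2 = push right.
N_ACTIONS = 3

def choose_action(q_row, epsilon, rng=random):
    """Epsilon-greedy choice: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(N_ACTIONS)                      # random push/coast
    return max(range(N_ACTIONS), key=lambda a: q_row[a])     # best-known action

def q_update(q_table, state, action, reward, next_state, alpha, gamma):
    """Move Q(s, a) a fraction alpha toward the target r + gamma * max Q(s', .)."""
    target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (target - q_table[state][action])
    return q_table[state][action]

# Usage: two discretized states, all Q-values starting at zero.
q_table = [[0.0] * N_ACTIONS for _ in range(2)]
q_update(q_table, state=0, action=2, reward=1.0, next_state=1, alpha=0.5, gamma=0.9)
```

With α = 0.5 the first update moves the entry halfway from 0 to the target of 1.0. A smaller α would take a smaller step per update, and a γ closer to 1 would give the best value in the next state more weight in the target.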

Swing setup

Rope, friction, and wind change the challenge

Reward is swing height (1−cos θ). The agent learns left/right pushes to build amplitude.
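As a small illustration of the height reward, the sketch below assumes θ is measured in radians from the straight-down rest position; the function name is hypothetical.

```python
import math

def height_reward(theta):
    """Swing-height reward 1 - cos(theta); theta in radians from straight down (assumed)."""
    return 1.0 - math.cos(theta)

# The reward is 0 at rest, 1 when the rope is horizontal, and symmetric left/right.
rest = height_reward(0.0)
horizontal = height_reward(math.pi / 2)
small_swing = height_reward(math.radians(-4.9))
```

Because the cosine is even, swinging the same angle to the left or to the right earns the same reward, so the agent is paid for amplitude rather than for direction.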

Rope: 1.20 m
Friction: 0.035
Push: 2.2
  • Push in the direction of motion near the bottom to add energy
  • Coasting near the top usually helps
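To make the roles of rope length, friction, and push strength concrete, here is a hedged sketch of one simulation step. The page does not state its equations, so the damped-pendulum model, the time step `dt`, the units of friction and push, and the action encoding (-1, 0, +1) are all assumptions.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2

def swing_step(theta, omega, action, rope=1.20, friction=0.035, push=2.2, dt=0.02):
    """Advance the swing by one step using semi-implicit Euler.

    Assumed model: angular acceleration = gravity term - friction * omega + push * action,
    with action in {-1 (push left), 0 (coast), +1 (push right)}.
    """
    ang_acc = -(G / rope) * math.sin(theta) - friction * omega + push * action
    omega = omega + ang_acc * dt   # update angular velocity first
    theta = theta + omega * dt     # then the angle, using the new velocity
    return theta, omega

def energy(theta, omega, rope=1.20):
    """Mechanical energy per unit moment of inertia (kinetic + potential)."""
    return 0.5 * omega ** 2 + (G / rope) * (1.0 - math.cos(theta))

# Usage: at the bottom, moving right, compare a push with the motion and one against it.
with_motion = energy(*swing_step(0.0, 1.0, action=+1))
against_motion = energy(*swing_step(0.0, 1.0, action=-1))
```

In this sketch a push along the current direction of motion raises the swing's energy, while a push against it lowers the energy. A longer rope weakens the gravity term and so slows the swing's natural rhythm.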

Swing simulator

Purple robot = agent · bar = height

Steps this episode: 0.0000
Episode return: 0.0000
Height: 0.0036
High swings: 0.0000

🤖 Height 0% · Episode best 0%
θ -4.9° · ω -0.17

Episode return

Higher swings raise the curve

Start training to see per-episode returns

Related lessons

  • Markov Chain: State Transitions and Stochastic Processes
  • Monte Carlo Integration: Numerical Approximation
  • MDP and Bellman Equation: Mathematical Basis of Reinforcement Learning