Published on: 2025-05-02
Description: Exploring the implementation of a Real Business Cycle model using reinforcement learning techniques, with insights on economic agent behavior, capital dynamics, and steady-state equilibrium.
Written by: Fadi Atieh
#RL
Over the past few weeks, I’ve been tinkering with a reinforcement learning (RL) simulation of a simple Real Business Cycle (RBC) model. I started from first principles and ran into a fascinating tangle of conceptual and technical questions. This post summarizes what I’ve learned so far.
The simplest economy is a “one-man army.” Every day, our solitary economic agent makes a few key decisions:
- how much to work,
- how much of the resulting output to consume today,
- and how much to invest as capital for tomorrow.
This is the foundational intuition behind RBC models. From here, everything else grows in complexity.
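To make that concrete, here is a minimal sketch of how I picture the single-agent economy as an RL environment. The functional forms (Cobb–Douglas production, log utility of consumption, linear disutility of labor) and every parameter value below are placeholder assumptions for illustration, not a fixed specification:

```python
import numpy as np

# A toy single-agent RBC environment (illustrative sketch; functional forms and
# parameters are assumptions, not the exact specification used in this post).
# State: the capital stock k. Actions: labor supply n in [0, 1] and the share
# of output that is invested rather than consumed.

class RBCEnv:
    def __init__(self, alpha=0.33, delta=0.05, A=1.0, chi=1.0, k0=1.0):
        self.alpha = alpha  # capital share in Cobb-Douglas production
        self.delta = delta  # depreciation rate of capital
        self.A = A          # total factor productivity
        self.chi = chi      # weight on the disutility of labor
        self.k = k0         # current capital stock (the state)

    def step(self, labor, invest_share):
        labor = float(np.clip(labor, 1e-6, 1.0))
        invest_share = float(np.clip(invest_share, 0.0, 1.0))

        output = self.A * self.k ** self.alpha * labor ** (1.0 - self.alpha)
        investment = invest_share * output
        consumption = max(output - investment, 1e-8)

        # Per-period reward: log utility of consumption minus linear labor disutility.
        reward = np.log(consumption) - self.chi * labor

        # Law of motion for capital.
        self.k = (1.0 - self.delta) * self.k + investment
        return self.k, reward


# Example: a fixed policy that works 0.8 of the time and invests 20% of output.
env = RBCEnv()
for t in range(5):
    k, r = env.step(labor=0.8, invest_share=0.2)
    print(f"t={t}: k={k:.3f}, reward={r:.3f}")
```

A policy in this environment is just a rule mapping the current capital stock to a labor choice and an investment share.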
Things get hazier once borrowing and lending enter the picture. This leads to a deeper question about modeling local vs. global behavior: can a society lend to itself? Can a lone agent simulate that structure? Still open questions for me.
In a borrowing-free model (pure investment and consumption), I solved for a steady state with the following parameters:
Results:
These values make intuitive sense: high productivity leads to accumulation of capital and a relatively low labor requirement due to disutility.
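For a standard specification (log utility of consumption, linear labor disutility, Cobb–Douglas production), the steady state can even be computed in closed form. The sketch below uses placeholder parameter values of my own choosing, not necessarily the ones behind the numbers above:

```python
import numpy as np

# Closed-form steady state of a standard RBC specification
# (log utility of consumption, linear labor disutility, Cobb-Douglas production).
# All parameter values are illustrative placeholders.

alpha = 0.33   # capital share
beta = 0.96    # discount factor
delta = 0.05   # depreciation rate
A = 1.0        # productivity
chi = 1.0      # labor disutility weight

# Steady-state Euler equation pins down the capital-labor ratio x = k / n:
#   1 = beta * (alpha * A * x**(alpha - 1) + 1 - delta)
x = (alpha * A / (1.0 / beta - 1.0 + delta)) ** (1.0 / (1.0 - alpha))

# Intratemporal (labor) condition with log utility: chi * c = (1 - alpha) * A * x**alpha
c = (1.0 - alpha) * A * x ** alpha / chi

# Resource constraint at the steady state: c = n * (A * x**alpha - delta * x)
n = c / (A * x ** alpha - delta * x)
k = x * n

print(f"steady state: k = {k:.3f}, n = {n:.3f}, c = {c:.3f}")
```

The Euler equation fixes the capital–labor ratio, the intratemporal condition fixes consumption, and the resource constraint then pins down labor and capital.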
One big question I’m grappling with:
If I simulate this setup using RL, shouldn’t the agent eventually converge to the steady state?
My intuition says yes, because of the Banach fixed-point theorem—if a unique fixed point exists and the Bellman operator is a contraction, the agent should find its way there.
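To spell that intuition out, write u for the period utility, f for the production function, β for the discount factor, and δ for depreciation (generic notation of my own). The Bellman operator for this problem is

$$
(TV)(k) \;=\; \max_{c,\;n}\;\Big\{\, u(c, n) + \beta\, V\big(f(k, n) + (1-\delta)k - c\big) \,\Big\},
$$

and, as long as rewards are bounded, it is a β-contraction in the sup norm:

$$
\lVert TV_1 - TV_2 \rVert_\infty \;\le\; \beta\, \lVert V_1 - V_2 \rVert_\infty ,
$$

so repeated application converges to the unique fixed point.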
But here’s the catch…
The mathematical form of the RBC problem in this setup, with the same notation, is the planner's problem:

$$
\max_{\{c_t,\, n_t,\, k_{t+1}\}_{t \ge 0}} \; \mathbb{E}\!\left[\sum_{t=0}^{\infty} \beta^{t}\, u(c_t, n_t)\right]
\quad \text{subject to} \quad
c_t + k_{t+1} = f(k_t, n_t) + (1-\delta)\,k_t .
$$
But here’s the problem: the per-stage cost is unbounded.
So value iteration or policy iteration algorithms may not converge unless we manually bound the control space.
One fix is to constrain the action space: for example, keep labor in [0, 1] and require consumption and investment to be nonnegative and no larger than current output.
But even then, the state space is technically still infinite. Discretizing won’t help unless we bound it too, which means imposing an upper bound on capital and truncating the grid.
Otherwise, we’d need function approximation (e.g., neural nets) to estimate value functions in continuous space.
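To see the bounded, discretized case actually work, here is a sketch of plain value iteration on a truncated capital grid with a small set of labor choices. The grid bounds, functional forms, and parameters are all illustrative assumptions; the point is only that once the state and action spaces are compact, the contraction argument applies and the iteration settles down:

```python
import numpy as np

# Value iteration on a truncated, discretized version of the problem
# (illustrative sketch; grid bounds, functional forms, and parameters are assumptions).
alpha, beta, delta, A, chi = 0.33, 0.96, 0.05, 1.0, 1.0

k_grid = np.linspace(0.5, 12.0, 150)   # truncated capital grid (bounded state space)
n_grid = np.linspace(0.1, 1.0, 10)     # discretized labor choices (bounded action space)

V = np.zeros(len(k_grid))
for sweep in range(1000):
    V_new = np.full_like(V, -np.inf)
    for i, k in enumerate(k_grid):
        for n in n_grid:
            # Resources available for consumption plus next-period capital.
            resources = A * k ** alpha * n ** (1.0 - alpha) + (1.0 - delta) * k
            c = resources - k_grid  # consumption implied by each candidate k'
            u = np.where(c > 0, np.log(np.maximum(c, 1e-12)) - chi * n, -np.inf)
            V_new[i] = max(V_new[i], np.max(u + beta * V))
    # Sup-norm stopping rule: on the truncated grid rewards are bounded,
    # so the Bellman update is a beta-contraction and this gap shrinks geometrically.
    if np.max(np.abs(V_new - V)) < 1e-6:
        V = V_new
        break
    V = V_new

print(f"converged after {sweep + 1} sweeps on a {len(k_grid)}-point grid")
```

On a grid like this the sweep converges in a few hundred iterations; the open question is what happens once the truncation is removed or replaced by function approximation.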
In practice, my RL agent doesn’t always converge to the steady state. Possible reasons: the unbounded per-stage reward, the truncation and discretization of the state and action spaces, and approximation error in the estimated value function.
This model started as a toy, but I’m realizing it exposes many foundational issues in applying RL to economic systems: unbounded rewards, continuous state and action spaces, whether learning dynamics actually reach the theoretical steady state, and how a single agent can stand in for aggregate behavior like lending.
Still early days, but I’m learning a lot—and that’s the whole point.