Modeling Urban Behaviors and City Dynamics with Large-Scale LLM-Driven Agent Simulation
Modeling human behavior in urban environments is fundamental for social science, behavioral studies, and urban planning. Prior work often relies on rigid, hand-crafted rules, which limits its ability to simulate nuanced intentions, plans, and adaptive behaviors. To address these challenges, we envision an urban simulator, CitySim, that capitalizes on the human-level intelligence exhibited by large language models. In CitySim, agents generate realistic daily schedules using a recursive value-driven approach that balances mandatory activities, personal habits, and situational factors. To enable long-term, lifelike simulations, we endow agents with beliefs, long-term goals, and spatial memory for navigation. CitySim exhibits closer alignment with real humans than prior work at both the micro and macro levels. Additionally, we conduct experiments that model tens of thousands of agents and evaluate their collective behaviors under various real-world scenarios, including estimating crowd density, predicting place popularity, and assessing well-being.
CitySim is a scalable city simulation framework powered by LLMs. Agents autonomously generate daily schedules and long-term plans through a recursive, value-driven planning process that balances mandatory activities, personal habits, and situational context. Each agent is equipped with spatial and temporal memories, which let it recall past experiences, form and update beliefs about places, and adapt its future decisions accordingly.
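To make "form and update beliefs about places" concrete, here is a minimal sketch of a scalar Kalman filter, the technique this page names for spatial memory. The `PlaceBelief` class, the 0-to-1 quality scale, and the noise values are assumptions chosen for illustration; the source does not specify CitySim's state representation or parameters.

```python
from dataclasses import dataclass


@dataclass
class PlaceBelief:
    """Hypothetical belief about one place (not CitySim's actual data model)."""
    mean: float      # estimated quality of the place, assumed to lie in [0, 1]
    variance: float  # uncertainty about that estimate


def kalman_update(belief, observation, obs_noise=0.05, process_noise=0.01):
    """One predict-and-update step of a one-dimensional Kalman filter."""
    # Predict: the place may have changed since the last visit, so uncertainty grows.
    prior_var = belief.variance + process_noise
    # Update: weight the new experience by how uncertain the prior belief is.
    gain = prior_var / (prior_var + obs_noise)
    mean = belief.mean + gain * (observation - belief.mean)
    variance = (1.0 - gain) * prior_var
    return PlaceBelief(mean, variance)


# An agent starts out unsure about a cafe, then has three good visits.
belief = PlaceBelief(mean=0.5, variance=0.25)
for rating in (0.9, 0.8, 0.85):
    belief = kalman_update(belief, rating)
```

Each visit pulls the estimate toward the observed rating and shrinks the variance, so early experiences move the belief more than later ones.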
Figure 1: Overview of CitySim: LLM-based agents with diverse personas plan daily activities, interact socially, and navigate a virtual city environment through Activity Planning, Social Interactions, and Mobility Prediction modules.
CitySim agents are equipped with advanced cognitive representations for realistic urban behavior:
Each agent's persona includes demographic attributes, spatial anchors (home and work), Big Five personality traits, and habits and preferences derived empirically from real-world surveys.
Memory comprises temporal memory (chronological events), reflective memory (thoughts and attitudes), and spatial memory (beliefs about points of interest, or POIs, updated via Kalman filtering).
Each agent tracks basic needs (hunger, energy, social) and long-term goals that evolve with experience. Needs decay over time and influence activity selection.
Activity planning follows a recursive approach that assigns importance scores to activities, balancing mandatory tasks, personal habits, and situational context to produce realistic schedules.
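The recursive, importance-based planning described above could look roughly like the following sketch. It assumes candidate activities already carry an importance score (in CitySim these presumably come from the LLM; here they are hard-coded) and fixed time windows. The `Activity` class and the `plan` function are hypothetical names, and the place-then-recurse strategy is one simple reading of "recursive", not the authors' confirmed algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    name: str
    start: int         # start hour, 0-24
    end: int           # end hour, 0-24
    importance: float  # assumed to be given; higher means harder to drop


def plan(candidates, start=0, end=24):
    """Place the most important activity that fits in [start, end),
    then recursively plan the free time before and after it."""
    fitting = [a for a in candidates if a.start >= start and a.end <= end]
    if not fitting:
        return []
    best = max(fitting, key=lambda a: a.importance)
    rest = [a for a in fitting if a is not best]
    return plan(rest, start, best.start) + [best] + plan(rest, best.end, end)


candidates = [
    Activity("work", 9, 17, 1.0),               # mandatory
    Activity("breakfast", 7, 8, 0.6),           # habit
    Activity("lunch with friend", 12, 13, 0.7), # situational, overlaps work
    Activity("gym", 18, 19, 0.5),               # habit
    Activity("dinner", 19, 20, 0.6),            # habit
    Activity("concert", 18, 21, 0.4),           # situational, overlaps gym and dinner
]
schedule = [a.name for a in plan(candidates)]
# schedule == ["breakfast", "work", "gym", "dinner"]
```

In this toy version, lower-importance activities that overlap a placed one are simply dropped. A fuller planner might reschedule them instead, or raise an activity's importance when a related need such as hunger is low.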
GPT-4o served as a judge, comparing anonymized daily routines from different methods on naturalness, coherence, and plausibility. CitySim achieves the highest win rate:
| Method | Avg Win Rate | vs. AgentSociety | vs. MobileCity |
|---|---|---|---|
| GeAn | 0.21 | 0.18 | 0.24 |
| HumanoidAgent | 0.28 | 0.31 | 0.29 |
| AgentSociety | 0.42 | - | 0.48 |
| MobileCity | 0.39 | 0.52 | - |
| CitySim | 0.61 | 0.67 | 0.64 |
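As a rough illustration of how pairwise win rates like those in the table can be tallied, the sketch below counts wins from a list of judge decisions. The `(method_a, method_b, winner)` record format and the three toy judgments are assumptions made for illustration; they are not the paper's data or evaluation code.

```python
from collections import defaultdict


def win_rates(judgments):
    """Return {(method, opponent): fraction of comparisons that method won}.

    `judgments` is an iterable of (method_a, method_b, winner) tuples,
    where winner must be either method_a or method_b.
    """
    wins = defaultdict(int)
    totals = defaultdict(int)
    for a, b, winner in judgments:
        if winner not in (a, b):
            raise ValueError(f"winner {winner!r} is not one of {a!r}, {b!r}")
        loser = b if winner == a else a
        wins[(winner, loser)] += 1
        # Count the comparison for both orderings of the pair.
        totals[(a, b)] += 1
        totals[(b, a)] += 1
    return {pair: wins[pair] / n for pair, n in totals.items()}


# Toy judgments, not taken from the paper.
judgments = [
    ("CitySim", "AgentSociety", "CitySim"),
    ("CitySim", "AgentSociety", "CitySim"),
    ("AgentSociety", "CitySim", "AgentSociety"),
]
rates = win_rates(judgments)
```

With no ties, the two directions of a pair sum to 1, which matches the AgentSociety and MobileCity entries in the table (0.48 and 0.52).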
We assess CitySim's ability to estimate population well-being using a proprietary survey dataset:
| Method | F1-macro |
|---|---|
| GeAn | 0.19 |
| AGA | 0.20 |
| HumanoidAgent | 0.22 |
| MobileCity | 0.21 |
| AgentSociety | 0.28 |
| CitySim | 0.36 |
| GBDT (Upper Bound) | 0.45 |
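For readers unfamiliar with the metric, macro-averaged F1 computes an F1 score per class and then takes the unweighted mean, so rare classes count as much as common ones. The sketch below is a plain-Python version. The three well-being labels and the example predictions are hypothetical, since the source does not describe the label set.

```python
def f1_macro(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical well-being classes and predictions.
y_true = ["low", "low", "mid", "high"]
y_pred = ["low", "mid", "mid", "high"]
score = f1_macro(y_true, y_pred)  # (2/3 + 2/3 + 1) / 3 = 7/9
```

scikit-learn's `f1_score` with `average="macro"` computes the same quantity and is the more practical choice when that library is available.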
Figure 2: Comparison of simulated (left) and real-world (right) crowd density heatmaps in Shibuya, Tokyo. CitySim accurately reproduces mobility patterns, with the highest densities around central transit nodes and major commercial streets.