Reinforcement Learning Navigation Model

MIT PhD Dissertation 2021

The Machu Picchu Design Heritage project was made possible thanks to the MISTI Global Seed Funds. MISTI is a part of the Center for International Studies within the School of Humanities, Arts, and Social Sciences (SHASS). The project was also sponsored by the Council of Science, Technology and Technological Innovation of Peru, with the support of the National University of Saint Anthony the Abbot in Cuzco and the Decentralized Directorate of Culture of Cusco.

"The shape of the environment elicits patterns of behavior while humans explore, learn, and navigate such sites. My vision is to study and simulate how humans, and tourists in particular, navigate architectural and heritage sites, and to evaluate their experience as they interact with each other and with the environment. Conventional simulation methods have included rule-based models such as procedural algorithms, synthetic models, and crowd simulation, yet machine learning has vastly expanded the field. In this thesis, we show that our reinforcement-learning-based method is a new and valid model for simulating visitors' behavior because it relies on data from humans and from the environment.

Reinforcement learning captures the nuances of navigation in a way that procedural algorithms, the most popular method, cannot. One example is how people interact with the architecture of these tourist sites: many stray from the designated paths, sometimes finding valuable views and sometimes finding less interesting places. It is exceedingly labor-intensive to model site exploration with procedural algorithms, because doing so amounts to prescribing a non-goal-oriented task, and exploration has therefore often been modeled as suboptimal behavior. By contrast, reinforcement learning enables us to extract tourists' exploratory behavior directly from the data. The model shows a "mode" that constitutes an inquiry about space.

This thesis seeks to demonstrate the potential of reinforcement learning for simulating human navigation patterns by showing accurate and nuanced simulated behavior, automation of the process, labor savings, generalizability, and expansive capabilities. The research proceeds in several steps. First, we collect data on human trajectories in real-world conditions using aerial recording. This required extensive data collection sessions in the field, followed by data processing. The data was collected at Machu Picchu, selected for its particular setting as an architectural site without roofs. We focus mainly on the Two Mirrors Temple. Next, we build the reinforcement learning model of the agents. Some character design methods from procedural programming were applied in the model, along with principles from cognitive science: path integration and map building, which are central components of human navigation, were incorporated. Finally, we train the agents with the human trajectory data. After training, the agents show exploratory navigation on the sites and are affected by the geometry of the architecture. We also developed a theoretical framework that identifies the impact of our work and further steps. In addition, we show that a model trained with data from one site generalizes to a new site, where the agents still exhibit exploratory behavior.

One of the main contributions of our work is that our model evaluates the maximum number of visitors that the site can receive while still being protected. The model enables researchers, managers, and designers to distribute visitors according to different parameters."

Date:

Last updated, 03.31.2020