Abstract

Constructing a universal moral code for artificial intelligence (AI) is challenging because human cultures differ in their values, norms, and social practices. We therefore argue that AI systems should adapt to culture based on observation: just as a child raised in a particular culture learns that culture’s specific values, norms, and behaviors, an AI system operating in a particular human community could learn them as well. How AI systems might accomplish this by observing and interacting with humans has remained an open question. Here, we propose inverse reinforcement learning (IRL) as a method for AI agents to acquire culturally relevant values implicitly from humans. We test this approach in an experimental paradigm in which AI agents use IRL to learn the reward functions that govern their actions from variations in the altruistic behavior of human subjects from two cultural groups playing an online game that requires real-time decision making. We show that an AI agent learning from a particular human cultural group can acquire the altruistic characteristics reflective of that group’s average behavior and can generalize to new scenarios requiring altruistic judgments. Our results provide a proof-of-concept demonstration that AI agents can be endowed with the ability to learn culturally typical behaviors and values directly from observing human behavior.
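
As a rough illustration of the idea of recovering a reward function from observed altruistic choices, the sketch below fits a single "altruism weight" by maximum likelihood under a softmax choice model. It is a deliberately simplified, one-step stand-in for IRL and is not the method used in the paper. Everything in it is an assumption made for illustration: the reward form (own cost plus a weighted benefit to the other player), the function names, the grid search, and the two small demonstration sets, which are invented and do not come from the study.

```python
import math

def choice_prob_help(w_other, cost, benefit, beta=1.0):
    """P(help) under a softmax policy (hypothetical model).

    Reward of helping = -cost + w_other * benefit; reward of not helping = 0.
    """
    utility = -cost + w_other * benefit
    return 1.0 / (1.0 + math.exp(-beta * utility))

def neg_log_likelihood(w_other, demos):
    """Negative log-likelihood of observed (cost, benefit, helped) choices."""
    nll = 0.0
    for cost, benefit, helped in demos:
        p = choice_prob_help(w_other, cost, benefit)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        nll -= math.log(p) if helped else math.log(1.0 - p)
    return nll

def fit_altruism_weight(demos, lo=-5.0, hi=5.0, steps=1001):
    """Grid search for the weight that best explains the demonstrations."""
    candidates = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return min(candidates, key=lambda w: neg_log_likelihood(w, demos))

# Invented demonstrations: (cost to self, benefit to other, helped?)
group_a = [(1, 2, True), (2, 2, True), (3, 2, False),
           (1, 1, True), (2, 1, False), (3, 3, True)]
group_b = [(1, 2, True), (2, 2, False), (3, 2, False),
           (1, 1, False), (2, 1, False), (3, 3, False)]

w_a = fit_altruism_weight(group_a)
w_b = fit_altruism_weight(group_b)

# Apply each learned weight to a scenario absent from both demonstration sets.
p_a_new = choice_prob_help(w_a, cost=2, benefit=3)
p_b_new = choice_prob_help(w_b, cost=2, benefit=3)
```

In this toy setting, an agent fitted to the group that helps more often ends up with a larger altruism weight, and therefore a higher probability of helping in the unseen scenario. A full IRL treatment of a sequential, real-time game would need a model of state transitions and multi-step planning, which this sketch omits.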

Publication Info

Year: 2025
Type: article
Volume: 20
Issue: 12
Pages: e0337914
Citations: 0
Access: Closed

Cite This

Nigini Oliveira, Jasmine Li, Koosha Khalvati et al. (2025). Culturally-attuned AI: Implicit learning of altruistic cultural values through inverse reinforcement learning. PLoS ONE, 20(12), e0337914. https://doi.org/10.1371/journal.pone.0337914

Identifiers

DOI: 10.1371/journal.pone.0337914