Mastering Among Us: How AI Learns to Detect Lies and Coordinate with Reinforcement Learning
Stanford researchers developed an AI system that excels at social deduction games like Among Us, using reinforcement learning to improve communication and lie detection.
February 24, 2025

Discover how AI agents can excel at social deduction games like Among Us through reinforcement learning and multi-agent communication. This approach enables AI to effectively interrogate, detect lies, and coordinate with other agents - skills that could have far-reaching implications.
How the Game of Among Us Works
The Challenges of Using Reinforcement Learning for Social Deduction Games
The Proposed Approach: Rewarding Effective Communication and Interpretations
Emergent Behaviors and Improved Success Rates
Relating to Reference Games and Theory of Mind
The Importance of Rich Reward Signals and Self-Play
Impressive Results Across Different Game Environments
Key Takeaways: AI's Improved Interrogation and Coordination Abilities, and the Power of Identifying Rich Reward Signals
Conclusion
How the Game of Among Us Works
Among Us is a popular multiplayer game where players are split into two groups: the uninformed majority (crewmates) and the informed minority (impostors). The goal of the crewmates is to identify and vote out the impostors, while the impostors aim to avoid detection and eliminate the crewmates.
The game mechanics are as follows:
- Crewmate Tasks: Crewmates have to complete various tasks, such as solving puzzles or operating switches, to win the game.
- Impostor Sabotage and Kills: The impostors aim to kill crewmates without being caught, and can sabotage ship systems to create confusion and opportunities.
- Reporting Corpses: When a crewmate discovers a dead body, they can report it, triggering a discussion phase.
- Discussion Phase: During this phase, all players (both crewmates and impostors) can discuss what they have observed and try to identify the impostors.
- Voting: At the end of the discussion, players vote to decide which player they believe is the impostor. If the majority votes correctly, the crewmates win. If the impostor is not identified, the impostor wins.
The key challenge in the game is that the impostors can lie and misdirect the crewmates during the discussion phase, making it difficult for the crewmates to determine the truth. Effective communication and coordination among the crewmates are crucial to successfully identifying the impostors.
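To make these mechanics concrete, here is a minimal Python sketch of the discussion-and-vote loop and the win conditions described above. The class and function names are illustrative and are not taken from the paper's code.

```python
from dataclasses import dataclass
from collections import Counter

@dataclass
class Player:
    name: str
    is_impostor: bool
    alive: bool = True

def run_meeting(players, votes):
    """Tally one discussion-phase vote and eject the player with the most votes."""
    tally = Counter(votes)                      # votes: list of player names
    ejected_name, _ = tally.most_common(1)[0]   # ties are broken arbitrarily in this sketch
    for p in players:
        if p.name == ejected_name:
            p.alive = False
            return p

def check_winner(players):
    """Crewmates win when no impostor remains; impostors win when they reach parity."""
    impostors = [p for p in players if p.is_impostor and p.alive]
    crew = [p for p in players if not p.is_impostor and p.alive]
    if not impostors:
        return "crewmates"
    if len(impostors) >= len(crew):
        return "impostors"
    return None  # game continues: crewmates keep doing tasks, impostors keep hunting
```

(Crewmates can also win by finishing all tasks; that condition is omitted here for brevity.)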
The Challenges of Using Reinforcement Learning for Social Deduction Games
The paper highlights several key challenges in using reinforcement learning (RL) for social deduction games like Among Us:
- Sparse Reward Signal: The sparse reward signal of winning or losing the game at the end is not informative enough to reinforce high-quality discussions between agents. Even if an agent votes correctly, they may still lose the game, and vice versa.
- Lack of Signal for Message Effectiveness: Agents do not have a strong signal for understanding the helpfulness of the messages they send or for learning the meaning of messages from other players.
- Complexity of Social Deduction Games: Social deduction games are more complicated than simple reference games, as the ground truth is not known to all players, and agents must communicate to collectively arrive at the answer, while the impostor tries to mislead the conversation.
To address these challenges, the paper proposes an approach that rewards messages generated during the discussion phase based on how they change the other crewmates' beliefs about the identity of the impostor, using the ground truth available during training. This provides a richer reward signal for both sending effective messages and interpreting messages from other agents.
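As a rough illustration of this idea, the snippet below sketches a "speaking" reward computed as the average shift in the other crewmates' beliefs toward the true impostor. The array shapes and the function name are assumptions made for illustration, not the paper's exact implementation.

```python
import numpy as np

def speaking_reward(beliefs_before, beliefs_after, impostor_idx):
    """Reward a speaker's message by how much it moves listeners toward the truth.

    beliefs_before / beliefs_after: arrays of shape (n_listeners, n_players), where
    each row is one listener's probability distribution over who the impostor is,
    measured just before and just after the message. impostor_idx is the ground-truth
    impostor, known to the training loop but not to the agents at play time.
    """
    gain = beliefs_after[:, impostor_idx] - beliefs_before[:, impostor_idx]
    return float(np.mean(gain))

# Example: one listener shifts from a uniform belief toward player 1 (the true impostor).
before = np.array([[0.25, 0.25, 0.25, 0.25]])
after = np.array([[0.10, 0.60, 0.15, 0.15]])
print(speaking_reward(before, after, impostor_idx=1))  # about 0.35: the message helped
```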
The paper also employs an iterated self-play algorithm, where crewmates and impostors train against earlier iterations of their adversaries' policies, similar to techniques used in games like Go and chess. This allows the agents to learn effective communication strategies through repeated practice, without the need for human demonstration data.
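A minimal sketch of such an iterated self-play loop might look like the following, where `train_crew` and `train_impostor` stand in for whatever RL training routine is used, and each side trains against a frozen snapshot drawn from the pool of its adversary's earlier policies. This is a simplified illustration, not the paper's training code.

```python
import random

def iterated_self_play(train_crew, train_impostor, init_crew, init_impostor, iterations=5):
    """Alternate training each side against frozen snapshots of the other side.

    train_crew / train_impostor are assumed training routines that take a frozen
    opponent policy and return an improved policy for their own side.
    """
    crew_pool, impostor_pool = [init_crew], [init_impostor]
    for _ in range(iterations):
        # Crewmates improve against an impostor drawn from earlier iterations.
        crew_pool.append(train_crew(opponent=random.choice(impostor_pool)))
        # Impostors then improve against a crewmate policy from the growing pool.
        impostor_pool.append(train_impostor(opponent=random.choice(crew_pool)))
    return crew_pool[-1], impostor_pool[-1]
```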
The results show that this approach significantly outperforms the baseline RL models, achieving up to twice the win rate in the base environment. The authors also find that the agents learn emergent behaviors commonly seen in human games of Among Us, such as directly accusing players and providing evidence to support their claims.
Overall, the paper demonstrates the power of identifying rich reward signals and using self-play to enable RL agents to excel at complex social deduction games, without relying on large amounts of human-provided data.
The Proposed Approach: Rewarding Effective Communication and Interpretations
The key proposal of this paper is to reward the agents based on how their messages and interpretations of other agents' messages impact the other crewmates' beliefs about the identity of the impostor. Specifically:
- The paper proposes an approach that rewards a message generated during the discussion phase based on how it may have changed the other crewmates' perception of who the impostor is, compared to the ground truth.
- Not only do the agents need a reward signal for how good their messages to the other crewmates are, but they also need a reward signal for how well they interpret messages from other crewmates. This rewards both speaking and listening (a minimal sketch of such a listening reward follows this list).
- The technique rewards messages that shift the other crewmates' beliefs toward the actual impostor. Conversely, it rewards interpretations of incoming messages that move the agent's own belief toward the actual impostor.
- This approach allows the agents to learn not only to accuse other crewmates of being the potential impostor, but also to provide proof and evidence to back up their claims.
- Importantly, this method requires no human demonstration data, relying solely on reinforcement learning and self-play, where the agents play the game repeatedly to discover the most effective communication strategies.
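As referenced in the list above, here is a minimal sketch of a "listening" reward, under the assumption that each agent produces a probability distribution over which player is the impostor after reading the discussion. The function name and the log-probability form are illustrative choices, not the paper's exact formulation.

```python
import numpy as np

def listening_reward(predicted_belief, impostor_idx, eps=1e-8):
    """Reward an agent for interpreting discussion messages correctly.

    predicted_belief: the listener's probability distribution over which player is
    the impostor, produced after reading the messages. The reward is the
    log-probability placed on the true impostor, so interpretations that point at
    the right player are reinforced directly.
    """
    return float(np.log(predicted_belief[impostor_idx] + eps))
```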
Emergent Behaviors and Improved Success Rates
The paper's key findings demonstrate that the proposed approach results in emergent behaviors commonly found in real games of Among Us between humans, such as directly accusing players and providing evidence to help other crewmates. Furthermore, this method achieves roughly twice the success rate of standard reinforcement learning (RL) approaches, and more than three times the success rate of base models over four times larger than the authors' own.
The authors attribute this significant performance improvement to their technique of rewarding messages generated during the discussion phase based on how they change the other crewmates' beliefs about the identity of the impostor. This reward signal, which is based on the ground truth of the game, provides a much richer learning signal for the agents compared to the sparse win/loss signal at the end of the game.
Additionally, the paper highlights the importance of training the agents to not only generate effective messages but also to interpret the messages from other agents accurately. By rewarding the agents for their ability to correctly identify the impostor based on the messages they receive, the authors were able to further boost the agents' performance.
The authors also employed an iterated self-play algorithm, similar to the techniques used by AlphaGo and chess engines, where the crewmates and impostors train against earlier iterations of their adversaries' policies. This self-play approach allows the agents to continuously improve their strategies and coordination, leading to the impressive results observed in the experiments.
Overall, the paper demonstrates the power of reinforcement learning in enabling AI agents to excel at complex social deduction games like Among Us, without the need for extensive human demonstration data. The authors' approach of identifying rich reward signals and leveraging self-play can serve as a blueprint for applying similar techniques to other domains where the reward signal may not be immediately obvious.
Relating to Reference Games and Theory of Mind
The paper discusses the connection between the social deduction game of Among Us and the concept of reference games and theory of mind.
Reference games are a type of task where a speaker needs to communicate to listeners in a way that allows them to identify a specific image or object from a set. Humans are naturally adept at these tasks, using theory of mind reasoning to determine the speaker's intent and select the correct image.
Social deduction games like Among Us are more complex, as the ground truth is not known to all players. Teams must communicate to collectively determine the correct answer, while the impostor tries to mislead the discussion. The paper notes that the sparse reward signal of winning or losing the game makes it difficult to utilize communication effectively using reinforcement learning alone.
To address this, the proposed approach takes advantage of the social deduction component of the game. By rewarding messages that change other players' beliefs about the impostor's identity, based on the known ground truth, the agents can learn to communicate more effectively without requiring large datasets of human demonstrations.
The paper also discusses the use of self-play, where the agents train against earlier versions of their adversaries' policies, similar to techniques used in games like Go and chess. This iterative self-play allows the agents to continuously improve their coordination and communication abilities.
Overall, the key insights are the identification of a rich reward signal based on the social deduction aspect of the game, and the use of self-play to enable effective multi-agent communication and reasoning without relying on extensive human data.
The Importance of Rich Reward Signals and Self-Play
The key proposal of this paper is to reward the messages generated during the discussion phase based on how they change the other crewmates' beliefs about the identity of the impostor. This provides a rich reward signal that does not require vast amounts of human data to train the models using reinforcement learning.
The paper finds that this approach results in emergent behaviors commonly found in real games of Among Us, such as directly accusing players and providing evidence to help other crewmates. The technique achieves twice the success rate of standard reinforcement learning, and more than three times the success rate of larger base models.
The paper highlights the importance of identifying rich reward signals, even in situations where the obvious reward signal (winning or losing the game) is not informative enough for effective reinforcement learning. By leveraging the ground truth about the impostor's identity, the agents can be trained to improve their reasoning and communication without relying on human demonstration data.
Additionally, the paper employs an iterated self-play algorithm, where the crewmates and impostors train against earlier iterations of their adversaries' policies. This self-play approach, similar to techniques used in games like Go and chess, allows the agents to continuously improve their performance through repeated interactions.
The key takeaways from this work are:
- AI agents can now excel at interrogation and spotting lies, with significant implications for various applications.
- Coordinated multi-agent systems can be even more powerful when the reward signals are carefully designed.
- The ability to identify rich reward signals, even in non-obvious scenarios, is crucial for effective reinforcement learning and can lead to impressive results with small models and limited resources.
- Self-play, where agents train against earlier versions of themselves, is a powerful technique for driving continuous improvement in complex multi-agent environments.
This paper demonstrates the potential of reinforcement learning and self-play in tackling challenging multi-agent coordination tasks, such as social deduction games, without relying on large amounts of human demonstration data. The insights gained from this work can be applied to a wide range of domains where identifying the right reward signals is key to enabling AI systems to reach new levels of performance.
Impressive Results Across Different Game Environments
The paper presents impressive results across various game environments in the social deduction game Among Us. The researchers explored different configurations of the game, varying the environment shape, number of tasks, and number of players.
The key findings are:
- The base model, without any additional techniques, performed the worst across all configurations.
- Increasing the model size (from the base model to a 7-billion-parameter version) provided some improvement, but the larger model still struggled to reason about the identity of the impostors.
- Reinforcement learning (RL) alone, without the additional listening and speaking rewards, significantly boosted the performance compared to the base models.
- Training the model to optimize only its listening ability, without RL, was an effective baseline, as predicting the identity of the impostor is valuable in Among Us.
- Combining RL with the listening and speaking rewards dramatically increased the success rate, achieving twice the win rate of the RL-only baseline in the base environment.
- The self-play iteration process further improved the win rate, demonstrating the effectiveness of this technique.
The researchers conclude that despite starting from relatively weak base models, the agents were able to learn to speak effectively and extract information from discussion messages. Additionally, the agents were robust to adversarially trained impostors, who were unable to break the crewmates' coordination during discussions.
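A simple way to picture these comparisons is an evaluation harness that rolls out many games per configuration and reports the crewmates' win rate. The sketch below assumes a `play_episode` simulator hook and placeholder policy variables rather than the paper's actual setup.

```python
def estimate_win_rate(play_episode, crew_policy, impostor_policy, n_games=500):
    """Roll out n_games full games and return the crewmates' win rate.

    play_episode is an assumed simulator hook returning "crewmates" or "impostors".
    """
    wins = sum(
        play_episode(crew_policy, impostor_policy) == "crewmates"
        for _ in range(n_games)
    )
    return wins / n_games

# Comparing the ablations described above (policy variables are placeholders):
# for label, policy in [("base model", base_policy),
#                       ("RL only", rl_policy),
#                       ("RL + listening + speaking rewards", full_policy)]:
#     print(label, estimate_win_rate(play_episode, policy, impostor_policy))
```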
Key Takeaways: AI's Improved Interrogation and Coordination Abilities, and the Power of Identifying Rich Reward Signals
- AI agents have significantly improved their abilities in social deduction games that require interrogation and multi-agent communication, such as the game Among Us. They can now effectively interrogate, detect lies, and coordinate with other agents to identify the impostor.
- The key to this improvement is the researchers' ability to identify a rich reward signal that does not require vast amounts of human demonstration data. By rewarding agents based on how their messages and interpretations impact the other agents' beliefs about the impostor's identity, the agents can learn effective communication strategies without relying on human examples.
- The use of reinforcement learning and self-play allows the agents to iteratively improve their performance, similar to how AlphaGo and chess engines have become highly skilled through self-play.
- This approach of identifying rich reward signals has broader implications beyond social deduction games. The ability to find verifiable reward signals, even in domains where the "right answer" is not obvious, is crucial for enabling AI systems to excel in a wide range of tasks and applications without requiring extensive human-provided data.
- The success of this technique highlights the power of reinforcement learning and the importance of finding the right reward signals. This could lead to AI agents becoming highly skilled at tasks that involve interrogation, deception detection, and multi-agent coordination, with potential applications in various fields.
Conclusion
The key takeaways from this research are:
- AI agents can now excel at social deduction games like Among Us that require interrogation and multi-agent communication to identify the "bad guy".
- The researchers developed a reinforcement learning approach that rewards agents based on how their messages impact the other players' beliefs about the impostor's identity. This provides a rich reward signal without requiring large amounts of human demonstration data.
- The agents trained using this method exhibit emergent behaviors commonly seen in human gameplay, such as directly accusing players and providing evidence to support their claims.
- This technique results in a 2x higher success rate compared to standard reinforcement learning, and over 3x higher than larger baseline models.
- The ability to identify effective reward signals, even in complex social scenarios, is a key enabler for applying reinforcement learning to a wide range of tasks beyond just games. This could have significant implications for AI's ability to engage in effective communication and reasoning.
- The self-play training approach, where agents iteratively improve by playing against previous versions of themselves, is a powerful technique that has driven major breakthroughs in other domains like chess and Go.
In summary, this research demonstrates how reinforcement learning with carefully designed reward signals can enable AI agents to excel at social deduction tasks that require sophisticated communication and reasoning skills. This points to a path for applying similar techniques to unlock AI's potential in a variety of real-world domains.