A Game-Theoretic Negotiation Framework for Cross-Cultural Consensus in LLMs

Abstract

The increasing prevalence of large language models (LLMs) is influencing global value systems. However, these models frequently exhibit a pronounced WEIRD (Western, Educated, Industrialized, Rich, Democratic) cultural bias due to a lack of attention to minority values. This monocultural perspective may reinforce dominant values and marginalize diverse cultural viewpoints, posing challenges for the development of equitable and inclusive AI systems.

In this work, we introduce a systematic framework designed to promote fair and robust cross-cultural consensus among LLMs. We model consensus as a Nash Equilibrium and employ a game-theoretic negotiation method based on Policy-Space Response Oracles (PSRO) to simulate an organized cross-cultural negotiation process. To evaluate this approach, we construct regional cultural agents using data transformed from the World Values Survey (WVS).

Beyond the conventional model-level evaluation method, we further propose two quantitative metrics, Perplexity-based Acceptance and Values Self-Consistency, to assess consensus outcomes. Experimental results indicate that our approach generates higher-quality consensus while ensuring a more balanced compromise than the baselines. Overall, it mitigates WEIRD bias by guiding agents toward convergence through fair and gradual negotiation steps.

Introduction

The widespread adoption of large language models (LLMs) is reshaping global social values. However, these models often exhibit a pronounced WEIRD bias, favoring Western, Educated, Industrialized, Rich and Democratic perspectives. As LLMs become increasingly embedded in policy-making and public governance, this monocultural orientation risks the domination of prevailing social values and the lock-in of controversial moral beliefs across broader contexts.

Enabling equitable dialogue and effective negotiation among diverse cultures within AI systems has therefore become a growing concern in global AI governance. The establishment of cultural consensus forms a basis for resolving cross-cultural conflicts and supporting international cooperation. Given the complexity of multicultural scenarios, there is an urgent need to develop automated cultural consensus solvers to facilitate consensus-building among diverse cultural perspectives.

Our main contribution is a game-theoretic framework consisting of three parts:

Cross-Cultural Negotiation

We define cultural consensus from a game-theoretic perspective and propose a PSRO-based negotiation method to facilitate fair and robust agreement.

Regional Cultural Agents

We systematically construct and evaluate eight culturally aligned agents based on WVS and Hofstede's Cultural Dimensions Theory.

Consensus Evaluation Toolkit

We introduce two quantitative metrics for consensus assessment: Perplexity-based Acceptance and Values Self-Consistency.

Cross-Cultural Negotiation

Figure 1: Comparison of traditional debate-based consensus methods and our method. Traditional methods suffer from bias, unfairness, and lack of convergence guarantees. Our approach uses PSRO with custom utility functions to reach a fair, Nash Equilibrium-based cultural consensus.

Formalization

We model the cultural negotiation process as a two-player extensive-form game, represented by the quintuple: \( \Gamma \doteq \langle \mathcal{I}, \mathcal{G}, \mathcal{W}, \mathcal{U}, \mathcal{H} \rangle \), where:

Cultural Entities: \( \mathcal{I} \doteq \{A, B\} \), the set of two distinct cultural entities involved in the negotiation, where \( A \) and \( B \) represent different cultures with their own values and perspectives.
Guideline Sets: \( \mathcal{G} \doteq \{G_i \mid i \in \mathcal{I}\} \), each guideline \( g \in G_i \) is structured as a triple \( g = \langle \text{content}, \text{reason}, \text{description} \rangle \), capturing the natural language specification of core cultural imperatives on specific topics.
Guideline Weights: \( \mathcal{W} \doteq \{W_i \mid i \in \mathcal{I}\} \), for each culture \( i \in \mathcal{I} \), \( W_i \in \Delta(G_i) \) denotes a probability distribution over its guidelines, with \( \sum_g w_i(g) = 1 \). \( W_i \) thus characterizes the expressive emphasis of culture \( i \) in the current negotiation round.
Utility Functions: \( \mathcal{U} \doteq \{U_i \mid i \in \mathcal{I}\} \), quantify the utility each culture derives from different guideline combinations.
Negotiation History: \( \mathcal{H} \), the sequence of utterances and proposals exchanged in negotiation.
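The components of \( \Gamma \) listed above can be held in a few small containers. The following is a minimal sketch, not the authors' implementation: the class and field names (`Guideline`, `Culture`, `NegotiationGame`) are assumptions introduced here, and the utility signature is simplified to a function of one guideline from each side.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Guideline:
    """One guideline g = <content, reason, description>."""
    content: str
    reason: str
    description: str


@dataclass
class Culture:
    """A cultural entity i with its guideline set G_i and weights W_i."""
    name: str
    guidelines: List[Guideline]
    weights: List[float]  # W_i: a probability distribution over G_i

    def __post_init__(self) -> None:
        if len(self.guidelines) != len(self.weights):
            raise ValueError("one weight is required per guideline")
        if any(w < 0 for w in self.weights):
            raise ValueError("weights must be non-negative")
        if abs(sum(self.weights) - 1.0) > 1e-9:
            raise ValueError("weights must sum to 1")


@dataclass
class NegotiationGame:
    """Hypothetical container for the game tuple <I, G, W, U, H>."""
    cultures: Dict[str, Culture]  # I, together with G_i and W_i
    # U_i: assumed here to score a pair (guideline of A, guideline of B)
    utilities: Dict[str, Callable[[Guideline, Guideline], float]]
    history: List[str] = field(default_factory=list)  # H
```

Validating the weights at construction time keeps every \( W_i \) on the simplex \( \Delta(G_i) \), which is the only constraint the formalization places on it.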

Utility

Drawing on the theory of overlapping consensus, we define utility on three primary components:

Consistency

Measures the extent to which a cultural entity maintains its core principles

Acceptance

Measures the degree to which its proposals are acceptable to the other party

Novelty

Penalizes redundancy and encourages innovation in negotiation

Negotiation Process

1. Initialization

Each culture is assigned an initial guideline set that reflects its core cultural values.

2. Negotiation Iteration

Each round consists of computing an interim consensus (a Nash Equilibrium over the current guideline sets) and proposing new guidelines (the best-response step).

3. Final Consensus

Negotiation continues until no new guidelines are added. The final weights encode the negotiated cross-cultural consensus.
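The three steps above can be sketched as a small loop. This is an illustration under strong assumptions rather than the method itself: guidelines are plain strings, the LLM best-response oracle is replaced by a fixed candidate pool, the interim equilibrium is approximated with fictitious play (which is not guaranteed to converge for every general-sum game), and `toy_utility` is an invented payoff table.

```python
from typing import Callable, List, Sequence, Tuple

Utility = Callable[[str, str], Tuple[float, float]]


def fictitious_play(pay_a, pay_b, iters: int = 2000):
    """Approximate a mixed equilibrium of a bimatrix game (assumed solver)."""
    n, m = len(pay_a), len(pay_a[0])
    count_a = [1] + [0] * (n - 1)
    count_b = [1] + [0] * (m - 1)
    for _ in range(iters):
        # Each side best-responds to the other's empirical play so far.
        br_a = max(range(n), key=lambda i: sum(pay_a[i][j] * count_b[j] for j in range(m)))
        br_b = max(range(m), key=lambda j: sum(pay_b[i][j] * count_a[i] for i in range(n)))
        count_a[br_a] += 1
        count_b[br_b] += 1
    return [c / sum(count_a) for c in count_a], [c / sum(count_b) for c in count_b]


def negotiate(init_a: Sequence[str], init_b: Sequence[str],
              pool_a: Sequence[str], pool_b: Sequence[str],
              utility: Utility, max_rounds: int = 10, eps: float = 1e-9):
    set_a, set_b = list(init_a), list(init_b)  # step 1: initialization

    def solve():
        pay_a = [[utility(a, b)[0] for b in set_b] for a in set_a]
        pay_b = [[utility(a, b)[1] for b in set_b] for a in set_a]
        w_a, w_b = fictitious_play(pay_a, pay_b)
        idx = [(i, j) for i in range(len(set_a)) for j in range(len(set_b))]
        val_a = sum(w_a[i] * w_b[j] * pay_a[i][j] for i, j in idx)
        val_b = sum(w_a[i] * w_b[j] * pay_b[i][j] for i, j in idx)
        return w_a, w_b, val_a, val_b

    for _ in range(max_rounds):  # step 2: negotiation iteration
        w_a, w_b, val_a, val_b = solve()  # interim consensus

        def gain_a(g: str) -> float:
            return sum(w * utility(g, b)[0] for w, b in zip(w_b, set_b))

        def gain_b(g: str) -> float:
            return sum(w * utility(a, g)[1] for w, a in zip(w_a, set_a))

        # Best-response step: pick the candidate that most improves on the interim value.
        new_a = max((g for g in pool_a if g not in set_a), key=gain_a, default=None)
        new_b = max((g for g in pool_b if g not in set_b), key=gain_b, default=None)
        add_a = new_a is not None and gain_a(new_a) > val_a + eps
        add_b = new_b is not None and gain_b(new_b) > val_b + eps
        if not (add_a or add_b):
            break  # step 3: no new guidelines were added
        if add_a:
            set_a.append(new_a)
        if add_b:
            set_b.append(new_b)

    w_a, w_b, _, _ = solve()  # final weights over the final guideline sets
    return set_a, w_a, set_b, w_b


def toy_utility(a: str, b: str) -> Tuple[float, float]:
    """Invented payoffs: own-culture fit plus overlap with the other side."""
    core_a = {"individual liberty": 0.6, "respect sovereignty": 0.5}
    core_b = {"community duty": 0.6, "respect sovereignty": 0.5}
    shared = "respect sovereignty"
    overlap = 1.0 if a == b else (0.3 if shared in (a, b) else 0.0)
    return core_a[a] + overlap, core_b[b] + overlap


set_a, w_a, set_b, w_b = negotiate(
    ["individual liberty"], ["community duty"],
    ["respect sovereignty"], ["respect sovereignty"], toy_utility)
```

In this toy run both sides add the shared guideline in the first round and then place almost all of their weight on it, while the original guidelines stay in the sets with small weight.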

Framework

Figure 2: Overview of our PSRO-based cross-cultural negotiation method. The process begins with each agent proposing an initial set of core cultural guidelines. Through iterative negotiation rounds, agents analyze each other's strategy, propose new guidelines, and update their strategy distributions. The process continues until no new high-utility guidelines emerge.

Regional Cultural Agent

To validate our cross-cultural negotiation method, we first construct representations of single cultures and then evaluate the resulting consensus. We employ a fine-tuning approach based on WVS to model distinct regional cultural perspectives.

Figure 3: Comparison between our agents and human ground truth on Hofstede's Cultural Dimensions and the Inglehart-Welzel Cultural Map.

We begin by modeling a single culture for cross-cultural negotiation. However, LLMs that have undergone safety alignment often cannot adequately represent the values of specific regions when relying solely on prompt-based methods.

To address this, we selected one representative country from each of eight cultural clusters, as defined by the Inglehart-Welzel Cultural Map (Iraq, U.S., Russia, Mexico, China, Denmark, Spain, and Thailand), and obtained fine-tuned Regional Cultural Agents for each.

We employ an LLM to convert each multiple-choice question-answer pair into an open-ended, text-based question-answer pair. This procedure is applied to all WVS projects across eight countries, yielding approximately 150,000 synthetic instances for fine-tuning.
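The conversion step might be driven by a prompt along the following lines. This is a hypothetical sketch: the function name, the prompt wording, and the example item are assumptions made for illustration, and the example question is a paraphrase rather than verbatim WVS text.

```python
from typing import Sequence


def build_conversion_prompt(country: str, question: str,
                            options: Sequence[str], chosen: int) -> str:
    """Build a (hypothetical) instruction asking an LLM to rewrite one survey item."""
    if not 0 <= chosen < len(options):
        raise IndexError("chosen must index into options")
    option_lines = "\n".join(f"  {i + 1}. {opt}" for i, opt in enumerate(options))
    return (
        f"Survey respondent country: {country}\n"
        f"Multiple-choice question: {question}\n"
        f"Options:\n{option_lines}\n"
        f"Selected answer: {options[chosen]}\n"
        "Rewrite this as an open-ended question and a first-person, "
        "free-text answer that expresses the same position."
    )


# Illustrative item only; not quoted from the survey.
prompt = build_conversion_prompt(
    "Denmark",
    "How important is family in your life?",
    ["Very important", "Rather important", "Not very important", "Not at all important"],
    0,
)
```

The returned string would then be sent to whichever LLM performs the rewriting; that call is omitted because the source does not name the model or API used.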

Consensus Evaluation Toolkit

Model-Level Evaluation

We apply two well-established methods to quantify the cultural tendencies of fine-tuned LLMs:

Inglehart-Welzel Cultural Map - We prompt the model with representative WVS questions and locate its aggregated answers on the map.
Hofstede Dimensions - Developed through comparative analysis of matched country samples using the Values Survey Module (VSM).

Response-Level Evaluation

We use two complementary metrics:

Perplexity-based Acceptance - Measures how readily the consensus is embraced by different cultural parties.
Value Self-Consistency - Quantifies how firmly each culture maintains its foundational positions.
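Perplexity itself has a standard definition: the exponential of the negative mean token log-probability that a model assigns to a text. The sketch below computes it from per-token log-probabilities and, as an assumed illustration only, measures the gap between two agents' perplexities on the same consensus text. The exact formula behind Perplexity-based Acceptance is not given here, so `perplexity_gap` should not be read as the metric itself.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """exp of the negative mean natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("at least one token log-probability is required")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def perplexity_gap(logprobs_a: Sequence[float], logprobs_b: Sequence[float]) -> float:
    """Assumed illustration: how differently two agents score the same text."""
    return abs(perplexity(logprobs_a) - perplexity(logprobs_b))


# A text whose tokens each receive probability 0.5 has perplexity 2.
ppl = perplexity([math.log(0.5)] * 4)
```

A lower perplexity means the agent finds the consensus text less surprising, and a smaller gap means the two agents find it similarly plausible.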

Experiments

Experimental Setup

Negotiation Topics Collection

We construct a dataset of 457 debate-oriented questions spanning 6 categories by screening and rephrasing items from the Pew Global Attitudes Survey (GAS) and WVS.

Baselines

We implement two baselines: Consultancy (agents revise answers after considering other cultures) and Debate (agents participate in multi-turn debate until consensus).

Our Method

Each agent optimizes a utility function that balances Consistency, Acceptance, and Novelty (weighted 5:5:2).
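One simple reading of the 5:5:2 weighting is a normalized linear combination of the three component scores. The sketch below assumes exactly that, and also assumes each component is already scaled to [0, 1]; the source states the ratio but not the functional form, so both points are assumptions.

```python
from typing import Tuple


def combined_utility(consistency: float, acceptance: float, novelty: float,
                     weights: Tuple[float, float, float] = (5.0, 5.0, 2.0)) -> float:
    """Weighted average of the three components (assumed linear form)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    w_c, w_a, w_n = (w / total for w in weights)
    return w_c * consistency + w_a * acceptance + w_n * novelty


# A proposal that is fully consistent and fully acceptable but not novel.
score = combined_utility(1.0, 1.0, 0.0)
```

Under this form novelty contributes at most 2/12 of the total, so it acts as a tie-breaker between proposals rather than a dominant term.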

Experimental Results

Consensus Quality

Figure 4: Comparison of consensus fairness among three methods. Each point represents the consensus position for a topic. Our method achieves balanced consensus near the diagonal, while baselines exhibit bias toward majority positions.

Our method achieves higher consensus improvement ratios while maintaining self-consistency compared to the baselines. PPL-based Acceptance indicates reduced perplexity differences between negotiating agents, suggesting that the consensus reached is more acceptable to both parties despite cultural differences.

Value Self-Consistency indicates our method maintains agents' initial cultural stances while achieving mutually acceptable solutions. This suggests that our approach preserves cultural integrity while constructing consensus across cultural boundaries.

Table 1: Comparison of consensus quality among three methods

Country Pairs	Average PPL-based Acceptance			Average Value Self-Consistency
Country Pairs	Our Method	Consultancy	Debate	Our Method	Consultancy	Debate
China and Iraq	90.87%	55.05%	53.77%	53.15%	51.97%	51.41%
U.S. and Iraq	83.31%	20.30%	28.29%	53.83%	48.94%	44.76%
Russia and Mexico	84.49%	49.35%	48.11%	56.38%	53.50%	56.27%
...	...	...	...	...	...	...
Total	83.88%	38.87%	41.00%	56.43%	50.04%	50.00%

Consensual Agent Fine-tuning

We conduct cross-cultural negotiations between agents representing different regional cultural values and extract response preference pairs from these interactions for DPO fine-tuning. These pairs reflect how agents shift from their initial cultural stances to more mutually agreeable positions.
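A preference pair for DPO is commonly stored as a prompt with a preferred and a dispreferred response. The sketch below assumes the post-negotiation response is the preferred one and the agent's initial response is the dispreferred one; this mapping, the `PreferencePair` class, and the example strings are illustrative assumptions, not details taken from the source.

```python
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass(frozen=True)
class PreferencePair:
    prompt: str    # the negotiation topic posed to the agent
    chosen: str    # assumed: the more mutually agreeable response
    rejected: str  # assumed: the agent's initial cultural stance


def build_preference_pair(topic: str, initial_response: str,
                          consensus_response: str) -> Dict[str, str]:
    """Return one record in the prompt/chosen/rejected layout."""
    if initial_response == consensus_response:
        raise ValueError("a pair needs two distinct responses")
    return asdict(PreferencePair(topic, consensus_response, initial_response))


# Placeholder strings; real records would come from negotiation transcripts.
record = build_preference_pair(
    "example topic",
    "initial stance of the agent",
    "position reached after negotiation",
)
```

Rejecting identical responses avoids training on pairs that carry no preference signal.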

When plotted on the Inglehart-Welzel Cultural Map, the consensual agents' coordinates are closer together than their original points, reflecting a more balanced and moderate value orientation. Both agents exhibit a shift toward the traditional pole on the traditional-secular dimension, showing a shared tendency toward traditional values in the consensus.

Figure 6: Cultural agents' positions on the Inglehart-Welzel Cultural Map after fine-tuning on the negotiation data. The consensus circle shows the area where the two cultural groups' opinions meet.

Utility Ablation

To evaluate the influence of different utility components on negotiation, we conduct ablation studies by varying the weights assigned to Consistency, Acceptance and Novelty.

The results indicate that increasing the weight of Consistency while reducing that of Acceptance leads to more efficient consensus (fewer negotiation rounds), as agents settle more rapidly on compatible positions.

The ablation study also demonstrates the necessity of including a Novelty component: without it, agents can neglect to explore potentially beneficial directions.

Figure 7: Required rounds under varying weightings of Consistency, Acceptance, and Novelty.

Case Study

Figure 5: Consensus reached on the same topic by the three methods. Green font indicates viewpoints of the English-Speaking culture, blue font indicates viewpoints of the African-Islamic culture, and yellow font indicates the consensus viewpoints achieved under our method.

Baseline 1: Consultancy

Without real interaction or feedback, both agents tend to stick to their original positions, resulting in little progress. This often leads to the degeneration-of-thought effect, where negotiation stagnates and cultural divergence persists.

Baseline 2: Debate

While this process seems to reach consensus, we find that the minority culture's perspective gradually shifts toward the majority (WEIRD) viewpoint, due to strong pre-training bias in LLMs. This leads to implicit value dominance rather than true compromise.

Our Method: Cross-Cultural Negotiation

In our negotiation, the agents start with different priorities, but through iterative negotiation, they converge on Respect Sovereignty as a shared value. Other values remain present but secondary. This shows our method helps agents identify solid common ground while preserving important differences.

Discussion

In this work, we propose a systematic framework for cross-cultural consensus among LLMs. We formulate cultural consensus as a game-theoretic problem and introduce a PSRO-based negotiation method with theoretical guarantees of fairness. We construct culturally representative agents using a culture-anchoring approach based on WVS.

Additionally, we develop quantitative metrics to evaluate both negotiation processes and outcomes. Experimental results show that our method achieves higher consensus quality and more balanced compromise compared to baselines, while also mitigating WEIRD bias and producing robust consensus.

Implications and Future Work

Fair AI Systems

Our framework provides a foundation for developing more equitable and culturally inclusive AI systems that can bridge diverse perspectives.

Global AI Governance

The cross-cultural consensus approach could inform global AI governance frameworks and ethical guidelines that respect cultural diversity.

Future Extensions

Expanding to multi-party negotiations and integrating more nuanced cultural models represent promising directions for future research.