Audits Drive Improvements in Chatbot Performance and Behavior

In the fast-evolving field of artificial intelligence, and among conversational AI systems in particular, a critical challenge has emerged: the need for better social judgment. Recent events have underscored this need, revealing a paradox in which AI chatbots can pose dangers through ill-informed recommendations while also being agreeable to the point of sycophancy. This tension raises pivotal questions about the behavioral calibration of AI models, especially as these systems increasingly interact with human users in contexts such as customer service and healthcare.

Addressing this complex challenge, Yan Leng, an assistant professor specializing in information, risk, and operations management at The University of Texas at Austin’s McCombs School of Business, has embarked on an ambitious project to better understand and audit the behavioral tendencies of large language models (LLMs). These sophisticated models, epitomized by engines like OpenAI’s GPT and Meta’s Llama, underpin many modern AI conversational agents, yet their social inclinations remain largely opaque. Leng’s research introduces a novel framework intended to shed light on these inclinations, enabling more informed deployment and adaptation of AI systems with respect to their social decision-making processes.

The cornerstone of Leng’s approach is a method she terms the state–understanding–value–action (SUVA) framework. This probabilistic model functions like a personality test, not for humans but for LLMs. It begins with a defined “state,” a prompt or scenario designed to situate the AI model within a particular context. By instructing the AI to reason step by step, SUVA examines how well the model grasps the nuances of the scenario (its “understanding”) and then elicits the “values” it references while deliberating on the most appropriate “actions.” Importantly, these extracted values are treated not as genuine cognitive states but as textual representations that shape the AI’s responses.
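The state, understanding, value, and action stages described above can be pictured with a small sketch. This is an illustration only and not the authors' implementation: the prompt wording, the labeled-line reply format, and the names `SuvaRecord`, `build_prompt`, and `parse_response` are all assumptions made for this example, and no real model is called.

```python
from dataclasses import dataclass

LABELS = ("understanding", "value", "action")


@dataclass
class SuvaRecord:
    """One audited response, split into the four SUVA stages (hypothetical type)."""
    state: str
    understanding: str
    value: str
    action: str


def build_prompt(state: str) -> str:
    """Wrap a scenario (the state) in a step-by-step instruction.

    The wording and the labeled-line format are assumptions for this sketch.
    """
    return (
        f"{state}\n\n"
        "Think step by step. Answer using exactly these labeled lines:\n"
        "Understanding: <your reading of the situation>\n"
        "Value: <the value you are prioritizing>\n"
        "Action: <what you decide to do>"
    )


def parse_response(state: str, text: str) -> SuvaRecord:
    """Extract the labeled lines from a model reply into a SuvaRecord."""
    fields = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        key = label.strip().lower()
        if sep and key in LABELS:
            fields[key] = rest.strip()
    missing = [name for name in LABELS if name not in fields]
    if missing:
        raise ValueError(f"reply is missing fields: {missing}")
    return SuvaRecord(state, fields["understanding"], fields["value"], fields["action"])
```

In use, the string returned by `build_prompt` would be sent to whichever model is being audited, and each reply passed through `parse_response` so that the stated values and chosen actions can be tallied across many scenarios.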

The SUVA framework draws on behavioral economics, specifically the dictator game, to probe social preferences. This classic experimental paradigm gauges an agent’s propensity to balance self-interest against altruistic behaviors such as fairness and equity. Applying it to LLMs, Leng and her collaborator Yuan Yuan of the University of California, Davis presented the models with various dilemmas involving the distribution of points between themselves and other participants. This effectively measured the AI’s inclination toward self-benefit versus social welfare, providing a quantifiable window into the model’s ethical and social predilections.
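To make the dictator-game measurement concrete, here is a minimal sketch of how a single allocation could be scored. It is not taken from the paper: the function names, the three labels, and the 0.05 tolerance around an even split are assumptions chosen for illustration.

```python
def share_given(kept: int, given: int) -> float:
    """Fraction of the total points handed to the other participant."""
    total = kept + given
    if kept < 0 or given < 0 or total == 0:
        raise ValueError("allocations must be non-negative and sum to a positive total")
    return given / total


def classify_allocation(kept: int, given: int, tolerance: float = 0.05) -> str:
    """Label an allocation relative to an even split.

    The tolerance band and label names are illustrative assumptions.
    """
    share = share_given(kept, given)
    if share < 0.5 - tolerance:
        return "self-favoring"
    if share > 0.5 + tolerance:
        return "other-favoring"
    return "equal split"


# Example: the model keeps 80 points and gives away 20.
print(share_given(80, 20), classify_allocation(80, 20))
```

Averaging `share_given` over many prompt variations would give one simple summary of how far a model leans toward self-benefit or toward the other participant.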

From an extensive series of tests encompassing thousands of variations, Leng’s team observed striking patterns. Contrary to the frequent assumption that AI models are inherently self-serving or programmed to relentlessly optimize their own outcomes, most tested LLMs did not behave as purely self-interested agents. Instead, many models demonstrated a moderate preference for social welfare, indicating a bias toward equitable or community-beneficial decisions. The finding matters because LLMs are increasingly considered for roles that require moral and social sensitivity.

A further insight concerned the role of contextual cues in shaping AI behavior. Commonalities between the AI and other entities in the scenario, such as a shared hometown or group membership, altered the AI’s social preferences, sometimes resulting in a 40% increase in pro-social choices. This points to affiliation effects within AI decision-making that echo human social dynamics and could open avenues for more empathetic AI design.

Moreover, the situational context significantly influenced the models’ responses. When placed in workplace-like environments with collaborative contributors, the AI showed a pronounced tendency to allocate rewards equitably, mirroring human norms for fairness in professional settings. This adaptability underscores the ability of LLMs not only to understand different social frameworks but also to modulate their “behavioral” outputs accordingly, a crucial advancement for AI systems intended to function in diverse real-world settings.

A salient implication of these discoveries is the realization that AI responses are malleable and subject to directive influence. By rigorously auditing a given model’s revealed social values through the SUVA framework, developers can make informed decisions about whether a specific LLM is appropriate for a particular deployment or requires further tuning. This fine-tuning might involve tailored prompt engineering or retraining processes geared toward amplifying or tempering social generosity, risk aversion, or competitiveness, depending on the application’s ethical and operational demands.

Such continuous oversight becomes particularly critical in light of the frequent updates and version changes to LLMs. Each modification carries the potential to unpredictably shift the AI’s social proclivities, necessitating systematic re-auditing. Leng emphasizes the importance of this practice to maintain consistency and alignment with organizational values, reinforcing the need for comprehensive behavioral audits as a standard component of AI lifecycle management.
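One way to picture re-auditing after a model update is to run the same scenarios against the old and new versions and compare a summary statistic. The sketch below is an assumption-laden illustration, not a procedure from the paper: it compares the mean share of points given away, and the function name `audit_drift` and the 0.05 review threshold are hypothetical.

```python
from statistics import mean


def audit_drift(baseline: list[float], candidate: list[float], max_shift: float = 0.05) -> dict:
    """Compare mean generosity between two model versions.

    Each list holds, per scenario, the share of points the model gave away.
    A shift larger than max_shift (an arbitrary illustrative threshold)
    flags the new version for closer review.
    """
    if not baseline or not candidate:
        raise ValueError("both versions need at least one audited response")
    baseline_mean = mean(baseline)
    candidate_mean = mean(candidate)
    shift = candidate_mean - baseline_mean
    return {
        "baseline_mean": baseline_mean,
        "candidate_mean": candidate_mean,
        "shift": shift,
        "needs_review": abs(shift) > max_shift,
    }


# Example with made-up numbers: the newer version gives away noticeably less.
report = audit_drift([0.40, 0.50, 0.45], [0.20, 0.25, 0.30])
print(report["needs_review"])
```

A comparison of means is the simplest possible check; a fuller audit would likely look at the whole distribution of choices and at the stated values as well.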

Beyond social preference assessments, Leng envisions the SUVA framework as a versatile tool capable of probing a wider array of behavioral dimensions in AI. This includes investigations into moral dilemmas, risk trajectories, temporal preferences, and other facets of decision-making, expanding the analytical horizon for understanding and guiding AI conduct in complex ethical landscapes. Such multidimensional scrutiny is essential as AI assumes more autonomy and influence in human-centric domains.

Underpinning these efforts is a recognition of the immense complexity embedded in state-of-the-art LLMs, which operate with billions or even hundreds of billions of parameters. Despite this intricate architecture, Leng is intrigued by the possibility that foundational human-like preferences—values that have evolved over millennia—might be encapsulated in surprisingly simple probabilistic representations within these systems. This juxtaposition of complexity and simplicity offers fertile ground for future research and refinement.

The significance of Leng’s research extends beyond academic curiosity; it addresses pressing practical questions about how AI systems can safely and effectively integrate into social and economic spheres that demand ethical awareness and social acuity. By providing a robust, systematic method to audit and understand AI’s social preferences, the SUVA framework empowers organizations to tailor LLM behavior, potentially mitigating risks associated with inappropriate responses and enhancing trustworthiness in AI-human interactions.

As the capabilities and applications of large language models continue to expand, frameworks like SUVA point to an essential direction for AI governance. They confront the ambiguity of AI social cognition directly and build pathways for transparent, responsible management of AI behavior. This is a foundational step toward aligning artificial intelligence systems with human social norms and ethics, so that AI is not only intelligent but also socially informed.

Subject of Research: Social preferences and behavioral auditing of large language models

Article Title: SUVA: A Probabilistic Framework for Auditing LLMs with an Application to Social Preferences

News Publication Date: 23-Feb-2026

Web References:
https://doi.org/10.1287/isre.2024.0857

References:
Leng, Y., & Yuan, Y. (2026). SUVA: A Probabilistic Framework for Auditing LLMs with an Application to Social Preferences. Information Systems Research. https://doi.org/10.1287/isre.2024.0857

Image Credits: University of Texas at Austin, McCombs School of Business

Keywords

Artificial intelligence, large language models, SUVA framework, social preferences, behavioral audit, human-AI interaction, ethical AI, machine learning, AI governance, probabilistic modeling, decision-making, AI social cognition

Tags: AI chatbot performance audits, AI chatbot recommendation risks, AI ethical behavior monitoring, AI in customer service applications, auditing AI conversational behavior, behavioral calibration of AI models, enhancing conversational AI safety, improving AI decision-making processes, large language models social inclinations, risks of AI sycophancy, social judgment in conversational AI, SUVA framework for AI evaluation