LLMs, Virtual Users, and Bias: Predicting Any Survey Question Without Human Data

Sinacola, Enzo; Pachot, Arnault; Petit, Thierry

arXiv:2503.16498 (cs)

[Submitted on 11 Mar 2025]

Abstract: Large Language Models (LLMs) offer a promising alternative to traditional survey methods, potentially enhancing efficiency and reducing costs. In this study, we use LLMs to create virtual populations that answer survey questions, enabling us to predict outcomes comparable to human responses. We evaluate several LLMs, including GPT-4o, GPT-3.5, Claude 3.5-Sonnet, and versions of the Llama and Mistral models, and compare their performance to that of a traditional Random Forests algorithm using demographic data from the World Values Survey (WVS). LLMs demonstrate competitive performance overall, with the significant advantage of requiring no additional training data. However, they exhibit biases when predicting responses for certain religious and population groups, where they underperform. Random Forests, by contrast, outperform LLMs when trained with sufficient data. We observe that removing censorship mechanisms from LLMs significantly improves predictive accuracy, particularly for underrepresented demographic segments where censored models struggle. These findings highlight the importance of addressing biases and reconsidering censorship approaches in LLMs to enhance their reliability and fairness in public opinion research.
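The virtual-population idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the profile fields, the prompt wording, the helper names (`build_prompt`, `parse_answer`, `predict_distribution`), and the `stub_model` stand-in for a real LLM call are all assumptions made for illustration. The sketch prompts one virtual respondent per demographic profile and aggregates the parsed answers into a predicted response distribution.

```python
from collections import Counter
from typing import Callable, Optional


def build_prompt(profile: dict, question: str, options: list) -> str:
    """Turn a demographic profile into a persona prompt (hypothetical wording)."""
    persona = ", ".join(f"{key}: {value}" for key, value in profile.items())
    choices = "\n".join(f"{i + 1}. {opt}" for i, opt in enumerate(options))
    return (
        f"You are a survey respondent with this profile: {persona}.\n"
        f"Question: {question}\n{choices}\n"
        "Reply with the number of your answer only."
    )


def parse_answer(reply: str, n_options: int) -> Optional[int]:
    """Return the zero-based option index, or None if the reply is unusable."""
    stripped = reply.strip()
    if not stripped:
        return None
    token = stripped.split()[0].rstrip(".")
    if token.isdigit() and 1 <= int(token) <= n_options:
        return int(token) - 1
    return None


def predict_distribution(
    profiles: list,
    question: str,
    options: list,
    ask_model: Callable[[str], str],
) -> dict:
    """Ask one virtual respondent per profile and aggregate answer shares."""
    counts = Counter()
    for profile in profiles:
        reply = ask_model(build_prompt(profile, question, options))
        index = parse_answer(reply, len(options))
        if index is not None:  # unusable replies are skipped, not guessed
            counts[options[index]] += 1
    total = sum(counts.values())
    return {opt: (counts[opt] / total if total else 0.0) for opt in options}


def stub_model(prompt: str) -> str:
    """Deterministic stand-in for an LLM call (assumption, not a real API)."""
    return "1" if "age: 25" in prompt else "2"


# Hypothetical profiles; the abstract does not list the WVS variables used.
example_profiles = [
    {"age": 25, "country": "FR"},
    {"age": 60, "country": "FR"},
    {"age": 25, "country": "BR"},
]
example_distribution = predict_distribution(
    example_profiles, "Is work important in your life?", ["Agree", "Disagree"], stub_model
)
```

In practice `ask_model` would wrap a call to whichever LLM is being evaluated, and the predicted distribution would be compared against the observed human responses for the same demographic segment.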

Comments:	Accepted, proceedings of the 17th International Conference on Machine Learning and Computing
Subjects:	Human-Computer Interaction (cs.HC); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computers and Society (cs.CY); Machine Learning (cs.LG)
Cite as:	arXiv:2503.16498 [cs.HC]
	(or arXiv:2503.16498v1 [cs.HC] for this version)
	https://doi.org/10.48550/arXiv.2503.16498

Submission history

From: Thierry Petit
[v1] Tue, 11 Mar 2025 16:27:20 UTC (202 KB)
