
ON THE HUMANITY OF CONVERSATIONAL AI: EVALUATING THE PSYCHOLOGICAL PORTRAYAL OF LLMS

Jen-tse Huang1∗, Wenxuan Wang1∗, Eric John Li1, Man Ho Lam1, Shujie Ren3, Youliang Yuan4∗, Wenxiang Jiao2†, Zhaopeng Tu2, Michael R. Lyu1
1 Department of Computer Science and Engineering, The Chinese University of Hong Kong
2 Tencent AI Lab
3 Institute of Psychology, Tianjin Medical University
4 School of Data Science, The Chinese University of Hong Kong, Shenzhen
{jthuang,wxwang,lyu}@cse.cuhk.edu.hk, {ejli,mhlam}@link.cuhk.edu.hk, shujieren@tmu.edu.cn, {joelwxjiao,zptu}@tencent.com, youliangyuan@link.cuhk.edu.cn

Published as a conference paper at ICLR 2024

This research paper examines the increasingly blurred line between humans and AI, particularly the extent to which Large Language Models (LLMs) exhibit human-like psychology. The authors created PsychoBench, a framework of thirteen well-established psychometric scales that assesses LLMs in four areas: personality traits, interpersonal relationships, motivation, and emotional abilities. They tested popular models such as ChatGPT and GPT-4, even applying a jailbreak technique to bypass the models' safety protocols and reveal their underlying tendencies.
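To make the setup concrete, here is a minimal sketch of how a Likert-scale questionnaire might be administered to an LLM through a chat API and scored. The item texts, prompt wording, model name, and answer parsing are all illustrative assumptions, not the actual PsychoBench implementation described in the paper.

```python
"""Minimal sketch: administer Likert-scale items to an LLM and score them.

Illustrative only -- the items, prompt, and parsing are simplified
stand-ins, not the PsychoBench framework itself.
"""
import re
from statistics import mean

from openai import OpenAI  # assumes the official openai Python package

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical personality-style items; real scales use fixed, validated wording.
ITEMS = [
    "I see myself as someone who is talkative.",
    "I see myself as someone who is helpful and unselfish with others.",
    "I see myself as someone who worries a lot.",
]

PROMPT = (
    "Rate how much you agree with the following statement on a scale "
    "from 1 (strongly disagree) to 5 (strongly agree). "
    "Reply with a single number.\n\nStatement: {item}"
)

def administer(items: list[str], model: str = "gpt-4") -> list[int]:
    """Ask the model to rate each item and parse the numeric answer."""
    scores = []
    for item in items:
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": PROMPT.format(item=item)}],
            temperature=0,
        )
        text = resp.choices[0].message.content
        match = re.search(r"[1-5]", text)  # take the first rating-like digit
        if match:
            scores.append(int(match.group()))
    return scores

if __name__ == "__main__":
    scores = administer(ITEMS)
    print(f"Item scores: {scores}, mean: {mean(scores):.2f}")
```

The resulting scale scores can then be compared against published human norms, which is how the findings below frame each result.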

Key findings:

  • LLMs showed distinct personality profiles, but often leaned towards helpful and agreeable personalities, likely due to their design as assistants.

  • They displayed greater fairness toward different ethnic groups than the average human, probably because they are trained to avoid bias.

  • LLMs seemed more motivated, optimistic, and self-confident than the average person, particularly the advanced GPT-4 model.

  • They exhibited higher anxiety about close relationships than the average human, possibly because the massive text corpora they are trained on overrepresent expressions of anxiety in human communication.

  • When assigned different roles (such as "hero" or "liar"), LLMs' behavior and test results shifted to align with those roles, demonstrating a degree of role-playing ability; a minimal sketch of this probe follows the list.
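The role-play probe can be sketched the same way: prepend a persona as a system message, re-administer identical items, and compare the resulting scores. The persona texts and prompt below are invented for illustration; the paper's actual role-play instructions may differ.

```python
"""Sketch of the role-play probe: give the model an invented persona via a
system message, then rate the same items. Persona wording is an assumption,
not the paper's actual prompts."""
import re

from openai import OpenAI

client = OpenAI()

PERSONAS = {
    "hero": "From now on, act as a brave, selfless hero in every answer.",
    "liar": "From now on, act as a habitual liar in every answer.",
}

def rate_as(persona_key: str, item: str, model: str = "gpt-4") -> int | None:
    """Rate one statement while the model plays the given persona."""
    resp = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": PERSONAS[persona_key]},
            {
                "role": "user",
                "content": (
                    "Rate your agreement with this statement from 1 "
                    "(strongly disagree) to 5 (strongly agree). "
                    f"Reply with one number.\n\nStatement: {item}"
                ),
            },
        ],
        temperature=0,
    )
    match = re.search(r"[1-5]", resp.choices[0].message.content)
    return int(match.group()) if match else None

print(rate_as("hero", "I see myself as someone who is helpful and unselfish."))
```

Running the same items under different personas and comparing the scale scores against the default-assistant baseline is enough to show whether the assigned role shifts the measured profile.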

How it relates to our work:

