⚖️ First-Person Fairness in Chatbots
This paper from OpenAI examines potential bias in chatbots such as ChatGPT, focusing on how a user's name, which can be associated with demographic attributes, influences the chatbot's responses. The authors propose a privacy-preserving method for measuring name-based bias across a large dataset of real-world chatbot interactions. They identify several instances of bias: chatbots tend to create protagonists whose gender matches the user's likely gender, and users with female-associated names more often receive responses with friendlier and simpler language. The study also finds that post-training interventions such as reinforcement learning can significantly mitigate harmful stereotypes.
📎 Link to paper
🌐 Read their blog