Abstract:
The growing use of Large Language Models (LLMs) in human-computer interaction
calls for systems that can predict user preferences among AI-generated
responses. This work studies the task of predicting user choices in a head-to-head
LLM response comparison using data from ChatBot Arena. We trained two machine
learning models, Random Forest and Logistic Regression, to see how well they capture
user preferences. The Random Forest model achieved an accuracy of 89%,
indicating a strong capacity to learn patterns from the data. In contrast, the Logistic Regression
model achieved a substantially lower accuracy of 57%, which suggests that linear approaches
are of limited use on this task. Our findings highlight the need for models that can capture
nonlinear patterns when predicting preferences in LLM interactions.
Index Terms—Large Language Models (LLMs), Random Forest, Logistic Regression