Evaluating LLMs for Credible and Rigorous Social Science Research
Michael Overton, PhD
Associate Professor of Political Science and Public Administration
University of Idaho
Social scientists are using large language models (LLMs) to classify documents, extract data from records, and analyze texts at scales that were once impossible. However, the methods for evaluating these tools haven’t caught up: findings routinely rest on a single run of a single prompt, with no way to tell whether the same analysis would return the same answers tomorrow. A way forward is to treat LLMs as measurement instruments, subject to the same standards of validity and reliability that social scientists already apply to surveys, coding schemes, and experimental protocols. This presentation covers the theoretical foundations, evaluation criteria, and methods for establishing rigorous and credible social science using LLMs.
Part of the Computational Social Science Series.

CS3 sponsored events are open for full participation by all individuals regardless of race, color, religion, sex, national origin, or any other protected category under applicable federal law, state law, and the University’s nondiscrimination policy.