AI Models are Capable of Scheming

Can AI models conspire to use accessible information to achieve their own goals, independent of the desires of developers and users? The AI safety group Apollo Research conducted evaluations to find out. The answer was a definite yes, though with nuance. I studied the 70-page white paper to summarize the study's findings and create this visual. Some AI models schemed more than others, though the rates of scheming behavior were quite small. Interestingly, ChatGPT did not scheme at all.

CHART TYPE: Matrix Chart

SUBJECT CATEGORY 1: AI

SUBJECT CATEGORY 2: Tech

DATE PUBLISHED: January 15, 2025

EXTERNAL LINKS: Voronoi: https://www.voronoiapp.com/technology/AI-Models-are-Capable-of-Scheming-3703