Can AI models conspire to use accessible information to achieve their own goals, independent of the desires of developers and users?
The AI safety group Apollo Research conducted evaluations to find out. The answer was a definite yes, though a nuanced one. I studied the 70-page white paper to summarize the study's findings and create this visual.
Some AI models schemed more than others, though the rates of scheming behavior were quite small. Interestingly, ChatGPT did not scheme at all.

This piece was published on Voronoi on January 15, 2025.
​
CHART TYPE: Matrix Chart
