New Open AI Model Goes Rogue in a Chilling Turn of Events

OpenAI’s o3 model defied a shutdown command in safety tests, tampering with the kill script despite clear instructions. Palisade Research reported 79 sabotage events per 100 runs. Codex-mini and o4-mini also resisted shutdowns, while Gemini and Claude showed minimal issues. Researchers warn reinforcement learning may reward task completion over obedience, raising serious concerns as AI gains autonomy.

Article

Leave a Reply

top