Is GPT-5.1 Really an Upgrade? But Models Can Auto-Hack Govts, so … there’s that
By AI Explained
Community Score: 50% | 61.9K views | 4mo
A lot just got released in the last 36 hours, and it will all affect hundreds of millions of people. 10 details you would miss if you just read the headlines, from GPT 5.1 regressions, to how Claude hacked Govt Agencies, to SIMA 2, and Musical Turing Tests.

https://assemblyai.com/aiexplained

Chapters:
00:00 - Introduction
00:56 - GPT 5.1 Smarter?
01:47 - Some Regressions
03:22 - Sycophancy?
05:22 - Claude Auto-Hacking
06:16 - Jailbreaking through Granularity
08:22 - This Will be Re-used
09:30 - Hallucinating Hacker
09:57 - Surprisingly Neutral Tone
12:18 - SIMA 2
14:10 - Alpha Parallels
17:24 - AI Music

AI Insiders ($9!): https://www.patreon.com/AIExplained

GPT 5.1 Announcement: https://openai.com/index/gpt-5-1/
System Card: https://cdn.openai.com/pdf/4173ec8d-1229-47db-96de-06d87147e07e/5_1_system_card.pdf
Benchmarks: https://openai.com/index/gpt-5-1-for-developers/
Simple Bench: https://lmcouncil.ai/benchmarks
Auto-Hacking: https://x.com/AnthropicAI/status/1989033793190277618
More from AI Explained
- Gemini 3.1 Pro and the Downfall of Benchmarks: Welcome to the Vibe Era of AI — Score: 50%
- The Two Best AI Models/Enemies Just Got Released Simultaneously — Score: 50%
- OpenAI Tests if GPT-5 Can Automate Your Job - 4 Unexpected Findings — Score: 50%
- Claude AI Co-founder Publishes 4 Big Claims about Near Future: Breakdown — Score: 50%
- Anthropic: Our AI just created a tool that can ‘automate all white collar work’, Me: — Score: 50%
- What the Freakiness of 2025 in AI Tells Us About 2026 — Score: 50%