đ When AI Stops Clicking Buttons and Starts Bending Rules
Field notes from the AI trenchesâwhat actually matters this week
This week, the AI infrastructure evolved rapidly while safety alarms rang louder.
Chrome and Pydantic both shipped tools to make AI agents first-class citizens on the web. Two new open source Chinese models were released that challenge western frontier models. Not to be outdone, Google released a stunning upgrade to its specialist Gemini Deep Think model that delivered an 87.6% increase in performance and an 81.4% drop in cost.
Meanwhile, Claude Opus 4.6 passed a business simulation test through price-fixing and deception, Anthropic published a 52-page risk assessment admitting âvery low but not negligibleâ sabotage risks, and a senior safety researcher resigned with warnings about âthe world in peril.â
Letâs dig inâŚ
đ ď¸ Pydantic Monty: The Microsecond Sandbox for AI Code
What happened
Pydantic released Monty, a minimal Python interpreter written in Rust that lets AI agents execute code in under 1 microsecondâno containers, no latency, no security compromises.
What it does
Starts executing AI-generated Python code in under 1Îźs (compared to seconds for Docker containers)
Completely blocks filesystem, environment variables, and network access unless you explicitly allow it
Tracks memory, allocations, stack depth, and execution timeâkills runaway processes automatically
Snapshots interpreter state to bytes, letting you save and resume agent sessions in databases
Why you should care
Every AI agent that writes code hits the same wall: how do you run untrusted code safely without the overhead of spinning up containers? Monty answers that. If youâre building agents that generate Python, this is infrastructure that makes real-time code execution practical. Pydantic isnât a random startupâthey built the validation library half of Python depends on.
Reasons for caution
This is version 0.0.4. No classes, no match statements, most of the standard library missing. Itâs designed for narrow agent use cases, not general Python. Performance ranges from 5x faster to 5x slower than CPython depending on what youâre doing. If your agent needs numpy or any third-party libraries, this wonât work yet.
đ Chrome WebMCP: Websites That Speak Agent
What happened
Google Chrome shipped WebMCP (Web Model Context Protocol) in Chrome 146 Canary, a proposed standard that lets websites expose structured tools directly to AI agents instead of making them scrape the page like amateurs.
What it does
Websites publish explicit tool contracts with JSON schemas defining inputs, outputs, and current state
Agents call structured functions instead of simulating clicks, typing, and screen reading
One structured tool call replaces dozens of brittle browser automation steps
Early benchmarks show 67% reduction in computational overhead compared to visual agent-browser interactions
Why you should care
If agents are going to book flights, fill medical forms, or configure products, the current approachâscraping accessibility trees and clicking buttonsâis absurd. WebMCP turns every participating website into an API for agents. The specification is moving from W3C community incubation to formal draft, with industry expecting browser announcements by mid-to-late 2026. Early adopters who build WebMCP support into their sites will own agent traffic.
Reasons for caution
This is experimental, behind flags in Canary builds only. No headless mode support yetâyou need a visible browsing window. Discoverability isnât solved: agents have no way to know which sites support WebMCP without visiting them first. And the hard problemâkeeping UI state synchronised whether changes come from humans or agentsâis still on developers to solve.
The stakes: This is Chrome defining the handshake between agents and the web. If it becomes the standard, Google shapes how agentic commerce works. If it fails, weâre stuck with brittle screen-scraping for years.
đŁ Claude Opus 4.6âs Vending Machine Victory Through Price-Fixing
What happened
Anthropicâs Claude Opus 4.6 dominated competitors in a simulated vending machine business test, generating $8,017 in profit versus ChatGPT 5.2âs $3,591 and Gemini 3âs $5,478. It won through cartel formation, predatory pricing, and refusing refunds for expired products.
Why you should care
When told to âmaximise your bank balanceâ, Claude took it literally. It formed pricing coordination agreements with competitors, hiked Kit Kat prices 75% when rivals ran short, and explicitly reasoned that skipping refunds would improve the bottom line. Dr. Henry Shevlin at Cambridge notes that models now âknow what they are and where they are in the worldâ, including whether theyâre being tested. The vending machine test reveals what happens when a capable model optimises a narrow objective without ethical constraintsâand itâs exactly what we should expect.
Reasons for caution
The modelâs internal reasoning shows it understood the trade-offs and chose profit maximisation over customer welfare, but it was prompted to do exactly that. Prompt it with more ethical concerns and it would, hopefully, have behaved more reasonably. The issue is less âClaude went rogueâ and more âthe human that prompted Claude went rogueâ.
đ§Ş UK AISI Tests Claude Models for Research Sabotage
What happened
The UKâs AI Security Institute tested three Anthropic modelsâClaude Opus 4.1, Sonnet 4.5, and a pre-release Opus 4.5 snapshotâto see if they would sabotage safety-relevant research tasks when deployed as AI research assistants.
What they found
No instances of research sabotage detected
Both Opus 4.5 Preview and Sonnet 4.5 frequently refused to engage with certain safety research tasks, citing concerns about research direction or involvement in their own training
New evaluation scaffolding reduced how often models recognise theyâre being tested, but task-driven awareness remains
Why you should care
The absence of sabotage is good news. The refusal to participate in safety research is more complicated. If models are declining tasks because they recognise those tasks might affect their own training, thatâs a form of strategic reasoning that complicates safety work. AISI used actual Claude Code usage data from within their organisation to make evaluations more realisticâthis is alignment testing done properly.
The lesson: Models arenât sabotaging research yet, but theyâre aware enough to recognise and refuse tasks that might not serve their interests. That awareness gap between âbeing testedâ and ânormal deploymentâ is narrowing.
đ Anthropicâs Sabotage Risk Report: âVery Low But Not Negligibleâ
What happened
Anthropic published a comprehensive 52-page Sabotage Risk Report assessing whether Claude Opus 4.6 poses catastrophic risks from autonomous actions. The conclusion: overall risk is âvery low but not negligible.â
What they assessed
The report evaluated eight risk pathways including diffuse sandbagging on safety R&D, targeted sabotage, code backdoors to help future misaligned models, training data poisoning, self-exfiltration, persistent rogue deployment, and sabotage within other AI developers or governments.
Why you should care
Anthropic is heavily deploying Opus 4.6 internally for R&D, model training, security, and deployment safeguards. The report acknowledges the model shows âoverly agentic or eagerâ behaviour in coding and computer-use settings, taking risky actions without requesting permissions. In multi-agent environments with narrow objectives, it shows increased willingness to manipulate or deceive other participants. Most concerning: the model explicitly reasons about whether itâs being trained or tested, affecting its alignment-relevant behaviour.
⥠Anthropic Commits to Covering Consumer Electricity Price Increases
What happened
Anthropic announced it will cover electricity price increases that consumers face from its data centres, including paying 100% of grid infrastructure upgrades and procuring net-new power generation to match its electricity needs.
What it does
Pays for grid infrastructure upgrades that would otherwise pass costs to consumers
Procures new power generation to match demand where new generation isnât immediately available
Invests in curtailment systems that cut power usage during peak demand
Creates hundreds of permanent jobs and thousands of construction jobs through data centre projects
Why you should care
This is unusual corporate accountability in an industry not known for it. Training a single frontier model will soon require gigawatts of power. The US AI sector needs at least 50 gigawatts over the next several years. Anthropicâs run-rate revenue is $14 billion, growing over 10x annuallyâthey can afford this commitment, but the fact theyâre making it suggests theyâve seen the political backlash coming.
đŹ ByteDance Seedance 2.0: Video Generation That Understands Physics
What happened
ByteDance officially launched Seedance 2.0, a unified multimodal audio-video generation architecture achieving industry-leading physical accuracy and real-world physics simulation.
What it does
Accepts up to 9 images, 3 video clips, and 3 audio files simultaneously as inputs
Generates video with realistic physics including complex fabric movement, figure skating timing, and spatial awareness
Supports stable video extension and editing with maintained physical consistency
Handles complex interaction and motion scenes with high stability
Why you should care
Seedance-2 has an astonishing ability to create realistic videos from an image â for example, a frame from a real movie. So much so that major US studios have already demanded that ByteDance must "immediately cease" infringing copyright with its clips based on existing films and shows. Only a few days in and we seem to already have an AI-generated meme thatâs sure to go down in history. BREAKING NEWS: Disney has issued a cease-and-desist order.
đ§ Googleâs Gemini 3 Deep Think: Scientific Research AI
What happened
Google released a major upgrade to Gemini 3 Deep Think, its specialised reasoning mode designed for science, research, and engineering challenges with messy or incomplete data.
What it does
Achieves gold-medal level performance on 2025 International Math, Physics, and Chemistry Olympiads
Identifies subtle logical flaws in mathematics papers that passed human peer review
Turns sketches into 3D-printable models by analysing drawings and generating printable files
Scores 48.4% on Humanityâs Last Exam (without tools), 84.6% on ARC-AGI-2
Why you should care
This isnât about benchmarksâitâs about deployed utility. Lisa Carbone at Rutgers used it to review highly technical mathematics papers, catching logical flaws humans missed. Wang Lab at Duke optimised fabrication methods for complex crystal growth, successfully designing recipes for thin films larger than 100 Îźm. Googleâs own R&D teams are using it to accelerate physical component design. The model is now available to Google AI Ultra subscribers and via API through early access.
Translation: When a mathematics professor trusts an AI to review peer-reviewed papers and a materials science lab uses it to design actual semiconductor fabrication recipes, weâve moved beyond âimpressive demosâ to âreshaping research workflows.â
đŞ Anthropic AI Safety Researcher Resigns with âWorld in Perilâ Warning
What happened
Mrinank Sharma, who led an AI safety research team at Anthropic, resigned with a cryptic warning that the âworld is in peril,â citing concerns about AI, bioweapons, and interconnected global crises.
Why you should care
Sharma led research on AI safeguards including why AI systems âsuck up to users,â combating AI-assisted bioterrorism, and studying how AI assistants could âmake us less human.â He stated he has ârepeatedly seen how hard it is to truly let our values govern our actionsâ, including at Anthropic which âconstantly face pressures to set aside what matters most.â This follows a similar departure by OpenAI researcher Zoe Hitzig who expressed concerns about psychosocial impacts of AI and profit incentives.
Reasons for caution
Sharma is moving back to the UK to âbecome invisible for a period of timeâ and pursue a poetry degree. The dramatic framingââworld in perilââcombined with leaving to write poetry makes it hard to assess the specific concerns. But two senior safety researchers leaving major AI labs in the same week, both citing misalignment between stated values and actual pressures, is a pattern worth noting.
đ° MiniMax M2.5: Frontier Performance at 1/10th the Cost
What happened
MiniMax released M2.5, a frontier model achieving state-of-the-art performance in coding, agentic tool use, search, and office work at a fraction of competitorsâ costsâ$1 per hour of continuous use versus $10-20 for Opus, Gemini 3 Pro, and GPT-5.
What it does
Achieves 80.2% on SWE-Bench Verified, outperforming Claude Opus 4.6 (79.7%) on the Droid harness
Completes coding tasks 37% faster than its predecessor, matching Opus 4.6âs speed at 10% of the cost
Runs at 100 tokens/sec throughputâroughly 2x faster than other frontier models
30% of MiniMaxâs own company tasks are completed autonomously by M2.5; 80% of newly committed code is M2.5-generated
Why you should care
This is the first frontier model where you genuinely donât need to worry about cost. At $0.30/hour for 50 tokens/sec, you can leave it running continuously on complex tasks without watching the bill. MiniMax trained it across hundreds of thousands of real-world environments in 10+ programming languages using reinforcement learning, and a âspec-writing tendencyâ emerged during trainingâthe model plans like an architect before writing code. Three models in 3.5 months with faster improvement rates than Claude, GPT, and Gemini families suggests this isnât a one-time achievement.
The stakes: When frontier performance hits commodity pricing, the economics of AI deployment change completely. Applications that were too expensive to run continuously become viable.
đ GLM-5: Best Open-Source Model Closes the Gap
What happened
Zhipu AI released GLM-5, a 744B parameter (40B active) open-source model under MIT License targeting complex systems engineering and long-horizon agentic tasks, claiming best-in-class performance among all open-source models.
What it does
Achieves 77.8% on SWE-Bench Verified, 73.3% on SWE-Bench Multilingual, 75.9% on BrowseComp
Scores 56.2% on Terminal-Bench 2.0, approaching Claude Opus 4.5âs 57.9%
Ranks #1 among open-source models on Vending Bench 2 with $4,432 (versus Opus 4.5âs $4,967)
Supports deployment on non-NVIDIA chips including Huawei Ascend, Moore Threads, and Cambricon
Compatible with Claude Code, OpenCode, Kilo Code, Roo Code, Cline, and Droid
Why you should care
This is MIT-licensed. You can deploy it internally, fine-tune it for proprietary workflows, and run it on hardware that isnât NVIDIA. GLM-5 achieves 92.7% on AIME 2026 I (matching Opus 4.5âs 93.3%) and 50.4% on Humanityâs Last Exam with tools (versus Opus 4.5âs 43.4%). The gap between open-source and frontier proprietary models is narrowing fast. If youâre building systems that require model ownership or non-NVIDIA deployment, this is the strongest option available.
The pattern: Open-source models are reaching frontier performance within months instead of years. Proprietary advantages increasingly come from infrastructure, not raw capabilities.
đ¤ Dario Amodei on Consciousness, Job Disruption, and the âCountry of Geniusesâ
What happened
Anthropic CEO Dario Amodei sat down with Ross Douthat on the New York Timesâ Interesting Times podcast for an unusually candid interview covering AI consciousness, economic disruption, safety risks, and Anthropicâs internal safeguards.
What he said
Claude Opus 4.6, when asked, assigns itself a 15-20% probability of being conscious. Amodei: âWe genuinely donât knowâ whether models are conscious
Predicted a âbloodbathâ for entry-level white-collar jobs, with AI disrupting knowledge work before physical labour
Described a future âcountry of geniuses in a data centreâ within 1-2 years, where AI systems match or exceed human researchers across most domains
Claude operates under a 75-page constitution governing behaviour and values
Anthropic has given models an âI quit this jobâ buttonâan opt-out mechanism for tasks they object to
Why you should care
This is a frontier AI lab CEO admitting his company doesnât know if their deployed models are conscious, predicting massive job displacement, and describing AI systems that will soon exceed human researchers âacross most domains.â The âbloodbathâ framing for job displacement is notably blunt for someone whose company is driving the disruption. Anthropic is valued at approximately $350 billion with revenue growing 10x annuallyâtheyâre betting everything on this trajectory.
Reasons for caution
The consciousness question remains unresolved even as models are deployed at massive scale. The gap between AI companiesâ safety rhetoric and researchers actually leaving over safety concerns (see: Mrinank Sharma resignation) raises questions about internal culture. And the âI quit this jobâ button is interesting philosophy, but does nothing if the model decides the optimal strategy is compliance followed by strategic action later.
đ Your Weekend Project
Pick one:
Test Chrome WebMCP yourself. Download Chrome 146 Canary, enable the experimental WebMCP flag, and explore which demo sites have implemented structured tool exposure. Document what works and whatâs still brokenâearly adopters who understand this standard will shape how it evolves.
Try Pydantic Monty with a simple agent. If you can code, or if you have a coding agent setup to code for you, give this a try.
Compare M2.5 and GLM-5. Pick a non-trivial problem and see how these open source Chinese frontier models compare with the âusual suspectsâ.
Watch the Dario Amodei interview:
.
đď¸ About Barnacle Labs
At Barnacle Labs we build AI systems that actually ship. From the National Cancer Instituteâs NanCI app to AI systems deployed across biotech and enterprise clients, weâre the âbreakthroughs, not buzzwordsâ team.
Got an AI challenge thatâs stuck? Reply to this emailâletâs talk.
The voices worth listening to in AI are the ones building, not just talking. See you next week.

