Sebrona · Blog

Field notes. No hype.

We run a daily AI radar inside JARVIS: Hacker News, Hugging Face papers, releases from Anthropic, OpenAI, Google DeepMind, Mistral and Meta, the labs we follow on X, arXiv preprints, GitHub trending, vendor changelogs. Not a public newsletter yet. One filter: does it touch the work we are paid to do? What passes becomes an ADR, a runbook entry, or a routing change. The blog is what we read on the way.

Field notes · 18 Jun 2026 · 11 min · By Miroslav Striško

The Week a Frontier Model Vanished in 72 Hours

Anthropic put its most capable model into general release on a Tuesday. A US government order pulled it that Friday. The takeaway has little to do with Fable itself: anything wired to a single closed API is now one directive away from going dark, and the open-weight bench has grown deep enough that you don't have to build that way.


Field notes · 1 Jun 2026 · 13 min · By Miroslav Striško

How to Build Secure AI Agents Without Sending Data to the Cloud

At Computex, NVIDIA shipped a credible answer at every layer of running autonomous agents on hardware you control — silicon, OS containment, the OpenShell runtime, an open-weight model, and the agent frameworks. The headline is RTX Spark. The part that changes procurement is that agent governance just became a free, standardised commodity — which moves the durable work up to policy, workflow, and GDPR-grade audit.


Field notes · 29 May 2026 · 5 min · By Miroslav Striško

Claude Opus 4.8: Why Reliability Matters More Than Benchmarks

Anthropic shipped Opus 4.8 at the same price as 4.7 and called it “incremental.” But the model is roughly four times less likely to lie about its own work — and for production AI, that reliability gain matters more than the version number suggests.


Field notes · 27 May 2026 · 10 min · By Miroslav Striško

What MiniMax M2 Means for Private Enterprise AI Deployments

MiniMax shipped an open-weight MoE that activates 9.8B of its 229.9B parameters per token and scores within four points of GPT 5.4 on SWE-bench Pro. Three numbers in our procurement spreadsheet change this week; a fourth thread is one we are still watching.


Field notes · 25 May 2026 · 12 min · By Miroslav Striško

Why AI Agent Design Is Changing Faster Than Most Teams Realize

Three threads worth acting on this week: KV cache reuse as the cost lever everyone keeps rediscovering, agent skills replacing the tool call as the unit of agent design, and performance forecasting becoming an exercise we can defend with numbers. The fourth, RL credit assignment, is the one we're watching but not acting on yet.
