№ 00 · AI architecture & implementation · Bratislava

Calling the model is one line. We build everything around it.

One architect designs, builds, and runs the production system around that call — and stays on-call for 30 days after cutover.

Sebrona is a Bratislava-based AI consulting and software-engineering practice building production AI systems for European mid-market and enterprise clients.

Book a 30-min architecture call · Read the 1‑page brief

ARCHITECT

One architect · brief to handover

EVAL SET

50–600 prompts per route · red build blocks merge

PAGED ON-CALL

30 days post-cutover · we hold the pager

MODEL FLOOR

Claude Opus 4.8 · Mistral Small 3.2 (Apache 2.0) · routing policy in week 1

CAPABILITY

p95 budget < 600ms TTFT · 5-min Anthropic cache · backup-model route

NOT FOR

horizontal SaaS subscriptions · twenty-engineer war rooms · six-week throwaway POCs

№ 01 · Capabilities

Four practices.
One architect runs all of them.

We build custom AI systems for companies, the kind you can't buy off the shelf. We automate manual, repetitive work. And when no existing product fits how a business runs, we build the software that does. One architect owns each engagement from the first call to handover.

Not on the menu: horizontal SaaS subscriptions, twenty-engineer war rooms, or six-week throwaway POCs.

№01
Custom AI systems
We build AI systems that work from your own data and run in production.
It’s software that works from your own records (orders, prices, products, history), not the public internet, so the answers fit your business. For example, a sales rep asks in plain language “what did this customer buy last quarter, and what’s their agreed price?” and gets it straight back, instead of digging through your old ERP. It replaces the daily hunt for things the company already knows, and it runs in everyday use on your own systems, not a demo.
Hybrid retrieval (BM25 + Voyage-3 dense embeddings, Cohere rerank-3 on top-50) over your own corpus. The Cohere rerank is opt-in, and sovereignty-strict corpora swap Voyage-3 for a local BGE-M3. Tool-using agents over Model Context Protocol (MCP) servers with typed contracts. Private inference on open-weight models (Mistral Small 3.2 / Qwen 3, Apache-2.0) where the data is too sensitive to leave the building. A 50–600-prompt gold set gates every prompt route through Promptfoo CI. Red eval blocks merge. ADRs for major decisions. Every call carries a full OpenTelemetry trace. Cutover on a tested rollback path: if shadow traffic flags a regression we don’t switch.
Stack
- Hybrid retrieval · BM25 + Voyage-3
- Tool-using agents · MCP
- Eval harness · CI-gated
- Local + hosted routing
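
A hedged sketch of how the lexical and dense rankings in hybrid retrieval could be merged before a reranker sees the top candidates. The page does not say which fusion method is used; reciprocal rank fusion (RRF) with the commonly used constant k = 60 is an assumption here, and `fuseRankings` and the document ids are hypothetical.

```typescript
// Reciprocal rank fusion: score(d) = sum over rankings of 1 / (k + rank(d)),
// with rank starting at 1. Assumed technique, not confirmed by the source.
function fuseRankings(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((docId, index) => {
      const rank = index + 1;
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first; ties broken by id so the order is deterministic.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .map(([docId]) => docId);
}

// Hypothetical ids: one list from BM25, one from the dense index.
const bm25Ranking = ["doc-a", "doc-b", "doc-c"];
const denseRanking = ["doc-b", "doc-d", "doc-a"];
const fused = fuseRankings([bm25Ranking, denseRanking]);
```

Documents that both retrievers rank highly rise to the top, and a document only one retriever found still survives into the candidate set that an optional reranker would then reorder.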
№02
Process automation
Repetitive, rule-based work: document review, data extraction, reporting. Moved onto software, with people kept on the exceptions.
We take the work your team repeats the same way every day and give it to software, keeping people on the cases that need a decision. For example, instead of a buyer checking every SKU each morning (what sold, what to reorder), the system learns from past orders, predicts demand, and prepares the order; they set it up once, then only get notified for the exceptions, like a sudden spike or a supplier price change. The routine clears off the desk, so the hours go to judgment, not re-keying.
Document review, data extraction, report generation, lead qualification: the work an analyst spends thirty hours a week on. Agents and n8n workflows take the routine path; the human stays on the edge cases. Where the shape fits, forty minutes per task drops to four. The shapes that fit are narrow and we name them: structured-document review, contract-field extraction, and lead triage off a fixed schema. We baseline the gain per engagement before we commit to a number. Not every workflow fits that shape, and we’ll tell you on the first call which of yours don’t. If we can’t measure the gain, you don’t pay for the build.
Stack
- n8n · queues
- Pydantic AI agents
- Webhooks · CRON
№03
Custom software & platforms
Custom software for the processes no off-the-shelf product fits. You own what we build.
When no product on the market matches how you actually work, or the one you have is painful to use, we build software shaped to your process instead of bending your process to a generic tool. For example, you keep the old ERP that nobody wants to open, and we put a clean front-end on top: your staff work in software built for them, while it reads and writes to the ERP underneath. You own what we build outright: no per-seat subscription, no vendor deciding your roadmap.
End-to-end product builds for clients whose problem doesn’t fit an off-the-shelf SaaS. Tenant model, billing, RBAC, audit log, admin console. The plumbing most teams only bolt on at Series A.
Stack
- Postgres RLS · per-tenant
- Stripe · automatic_tax
- RBAC + audit log
№04
Engineering & infrastructure
The backend, frontend and infrastructure the systems above run on, handed to your team at the end.
This is the foundation the three above run on (the servers, databases, and apps), built to hold up in real use, not just to demo. For example, the order data and the new front-end stay fast and online through your busiest ordering days, your data sits in the EU, and your own team can read and maintain the code after we leave. You get a working system handed over with its keys, not a black box you depend on us to touch.
Full-stack delivery: tRPC contracts with Zod validation, Postgres with row-level security, HMAC-verified webhooks, React web + React Native mobile. Cloudflare on the edge. First-byte budget: under 50ms across EU on warm-cache static and API routes; model-backed routes are gated separately at p95 < 600ms TTFT (see the stack). Data stays in EU regions (Supabase Frankfurt or your own racks).
Stack
- TypeScript
- Postgres
- Cloudflare Workers
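
A minimal sketch of the HMAC-verified webhook idea, using Node's built-in `crypto` module. The hex-encoded SHA-256 signature, the function name `verifyWebhook`, and the example secret are assumptions for illustration; each real provider defines its own header and signing scheme.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Verifies a hex-encoded HMAC-SHA256 signature over the raw request body.
// Encoding and secret handling are assumptions, not a specific provider's scheme.
function verifyWebhook(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody, "utf8").digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on a length mismatch, so check the length first.
  if (received.length !== expected.length) return false;
  return timingSafeEqual(received, expected);
}

// Usage with a hypothetical secret and payload.
const webhookSecret = "whsec_example";
const payload = JSON.stringify({ event: "order.created", id: 42 });
const goodSignature = createHmac("sha256", webhookSecret).update(payload, "utf8").digest("hex");
const accepted = verifyWebhook(payload, goodSignature, webhookSecret);
```

The comparison runs over the raw body, before any JSON parsing, and uses a constant-time check so the response time does not leak how much of the signature matched.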

№ 02 · Practice

The line of code is the easy part. Production AI is everything that surrounds it:

Retrieval grounded in your own data. Typed contracts that survive a schema change, evals that catch a silent faithfulness or refusal regression three sprints later and block the merge, a dashboard your DPO can read without our help, and on-call that pages the right person at 3am.

Principle 01

One lead architect owns the engagement end-to-end and writes the ADR behind every load-bearing decision. When scope demands a second head, the senior team joins. They don’t take it over.

Artefact

ADR · /docs/adr/NN.md

Principle 02

Sensitive prompts run on local open-weight models (Mistral Small 3.2 / Qwen 3, Apache-2.0). Everything else routes to Claude (Opus 4.8 for reasoning, Sonnet 4.6 for throughput, Haiku 4.5 for routine routes) or the GPT-5 family through a policy you sign off on. Every call is logged with model, policy, retrieval set, and cost. The routing decision is auditable, not assumed.

Artefact

routing.policy.yaml · signed off in week 1
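
One way to picture the routing principle in code, as a sketch only. The field names, the `route` function, and the policy version are hypothetical and do not describe the real `routing.policy.yaml` format; the model labels are placeholders, not verified API identifiers.

```typescript
// Hypothetical policy shape; illustrative only.
type Sensitivity = "sensitive" | "internal" | "public";
type Workload = "reasoning" | "throughput" | "routine";

interface RoutingPolicy {
  version: string;
  localModel: string;               // serves every sensitive prompt
  hosted: Record<Workload, string>; // serves everything else
}

interface RoutingDecision {
  model: string;
  policyVersion: string;
  reason: string;
}

const examplePolicy: RoutingPolicy = {
  version: "example-v1",
  localModel: "mistral-small-3.2",
  hosted: {
    reasoning: "claude-opus-4-8",
    throughput: "claude-sonnet-4-6",
    routine: "claude-haiku-4-5",
  },
};

// Sensitivity is checked first, so a sensitive prompt never reaches a hosted model.
function route(sensitivity: Sensitivity, workload: Workload, policy: RoutingPolicy): RoutingDecision {
  if (sensitivity === "sensitive") {
    return { model: policy.localModel, policyVersion: policy.version, reason: "sensitive data stays local" };
  }
  return { model: policy.hosted[workload], policyVersion: policy.version, reason: `hosted route for ${workload}` };
}

const decision = route("sensitive", "reasoning", examplePolicy);
```

Returning the policy version and a reason alongside the model is what makes each decision loggable and auditable rather than assumed.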

Principle 03

Your team owns the system the day we leave. That means ADRs in-repo, the eval harness wired into your CI, a registry with rollback documented and tested, a runbook written against actual incidents, and thirty days of paged on-call alongside you. Day 31 you have the pager.

Artefact

runbook.md · 30 days paged on-call

Live · review loop

Every system we ship goes through a review loop. This page went through it too.

Hit run and it scans the page you're reading. Each pass samples a different set of checks from the suite, and every one is already resolved in what you see.

Review complete · 4 of 54 checks

Performance · three.js · ~1.18 MB deferred
The hero's 3D scene (three.js, ~1.18 MB) loaded before the headline could paint.
Resolved · Code-split it. The text paints first; the sphere fades in after.
src/routes/index.tsx
```
const SceneBackdrop = lazy(() =>
  import("@/components/SceneBackdrop"),
);
// …
<Suspense fallback={null}>
  <SceneBackdrop />
</Suspense>
```
Accessibility · WAI-ARIA · aria-hidden
The architecture diagram is decorative, but a screen reader announced every node as content.
Resolved · Marked aria-hidden. The layer names live in the real text, not the geometry.
src/components/Architecture.tsx
```
<svg viewBox="-2.2 -2.2 4.4 4.4"
     className="w-full h-full"
     aria-hidden="true">
```
Motion · prefers-reduced-motion
Reveal animations and the rotating sphere ignored prefers-reduced-motion.
Resolved · Both honor it now. The page renders settled and the sphere holds still.
src/styles.css
```
@media (prefers-reduced-motion: reduce) {
  *, *::before, *::after {
    animation-duration: 0.01ms !important;
    transition-duration: 0.01ms !important;
  }
  .reveal { opacity: 1; transform: none; }
}
```
Layout regression · 328px shove · caught at 1920 · class guarded
A fix from an earlier review pass added a second position class to a rail that was already fixed. The cascade resolved against intent and shoved the hero 328 pixels down on wide screens. It shipped to production before an eyes-on check at 1920 caught it.
Resolved · Removed the duplicate class. The audit now fails any class string carrying two position utilities, so the whole bug class is locked out.
scripts/check-static-hazards.mjs
```
// two position utilities in one
// class string: CSS source order
// decides, not intent (the 328px
// hero shove).
if (utils.size > 1) {
  findings.push({
    hazard:
      "duplicate-position-classes",
    file, line,
  });
}
```
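
The excerpt above does not show how `utils` is built. A minimal sketch of one way it could be computed, assuming Tailwind-style position utilities; the utility list and the helper name `positionUtilities` are assumptions, not the contents of the real audit script.

```typescript
// Tailwind-style position utilities. The list the real check uses is not
// shown in the source, so this set is an assumption.
const POSITION_UTILITIES = new Set(["static", "fixed", "absolute", "relative", "sticky"]);

// Returns the distinct unprefixed position utilities found in one class string.
// Variant-prefixed classes such as "md:fixed" are ignored in this sketch.
function positionUtilities(classString: string): Set<string> {
  const found = new Set<string>();
  for (const token of classString.split(/\s+/)) {
    if (POSITION_UTILITIES.has(token)) found.add(token);
  }
  return found;
}

const cleanRail = positionUtilities("fixed top-0 left-0 w-16");
const hazardRail = positionUtilities("fixed relative top-0 left-0 w-16");
```

A result with more than one entry, as in `hazardRail`, is the condition the check flags.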

This is the defect class most sites ship and never catch. The full suite runs on every build of this page; every fix shown here is real code in this repository.

№ 03 · Selected practice

Five projects.
Three we can show.

Same architect on every project, kick-off through cutover. Every build ships the same floor: typed contracts (tRPC + Zod), a 50–600-prompt eval set wired into CI (Promptfoo / Inspect AI), and OpenTelemetry traces from edge to model.

№ 01 · Live · Shipping in production

ElektrikPro

Field-service SaaS for the Slovak electrical trade · operated jointly with M.Z.CONNECT s.r.o., the practicing electrician's company · paying customers

Our own vertical SaaS for the Slovak electrical trade: job tracking, invoicing, parts ordering. Co-founded with a domain operator who lives the workflow, with paying customers. The architecture patterns are the same ones we run on client engagements; we ship what we sell.

TypeScript · Postgres · Stripe · Mobile · Cloudflare

Read the case study

№ 02 · Delivered · In production

B2B distribution operator · CEE

Custom sales & procurement software, with a sovereign local AI on top.

A deterministic engine handles quoting and procurement — best price across vendors, adjustable margins, stock ETAs, demand-based ordering. On top, a local AI grounded in the company's own data and running on their hardware: it surfaces trends, retunes the system in plain language, reports daily, and answers the team's questions. Cloud models touch only the non-sensitive drafting.

RAG · Local inference · On-prem · Deterministic core · Claude Sonnet

Read the case study

№ 03 · Internal R&D · Sebrona testbed

JARVIS · internal R&D

The architect’s own 24/7 private AI infrastructure

The founder’s own stack, running on his hardware: voice (whisper-large-v3-turbo on Groq, ElevenLabs streaming TTS) and chat across Mac, iPhone, and Telegram, hooked into calendar, mail, messages, journal, health, and whatever app is on his screen. 204 tools reachable via Model Context Protocol. Routing is hybrid by design: Claude Opus 4.8 for reasoning-heavy turns, local Mistral Small 3.2 in the office for prompts that can’t leave the network. It’s also where we prove architecture patterns before they ship to clients.

Voice · Local-first · MCP · 24/7 · Founder-run

Read the case study

№ 04 · Delivered · NDA

EU public‑sector AI pilot

Private inference, default‑deny data, full audit trail

AI document workflow for a customer that cannot send data to US-hosted models. Hybrid retrieval over ~40k documents (BM25 + Voyage-3 dense, Cohere rerank-3 on top-50), then summarization on a quantized open-weight model on-prem. Non-sensitive prompts route to an EU-region hosted model through a policy layer the customer signed off on. The eval harness tests faithfulness, citation accuracy, and refusal correctness, built against a 600-item gold set with the customer’s own domain reviewers. Default-deny at every boundary; full audit trail streamed to their SIEM. Client name under NDA.

RETRIEVAL

Hybrid · 40k docs

EVAL SET

600 items · expert-reviewed

INFERENCE

Mistral / Qwen · Q4_K_M–Q8 · on-prem GPU

AUDIT

Default-deny · SIEM-streamed

Local inference · Default-deny · EU-region · Audit

№ 05 · In build · 2026

European venue operator · pre-launch

Bookings, assets, reporting, built around how the venue works

Custom operations platform for an asset-heavy venue operator that doesn’t fit horizontal SaaS. Bookings, asset management, reporting. Modelled on how the team works today, not how Mews or Lightspeed would prefer they worked. Six-month pre-launch, three internal cohorts before public open. Client name under NDA.

PRE-LAUNCH

6 months · 3 internal cohorts

DOMAIN

Bookings · assets · reporting

BUILT FOR

Asset-heavy ops, not horizontal SaaS

OPEN

Q4 2026 · staged rollout

Bookings · Asset mgmt · Multi-tenant · Custom

№ 04 · How an engagement runs

Five phases. By the end of week one, a fixed price.

Same order, every engagement. You commit to week one up front. By that Friday you have a fixed-scope proposal and a price. If the project doesn’t make sense as scoped, we say so and you owe nothing.

№ 01
Diagnostic week
5 days
On-site or remote. Read the codebase, the data, the team. End the week with a written architecture brief and a fixed-scope proposal.
№ 02
Architecture spec
1–2 weeks
Signed-off design: data model, inference routing, eval plan, integration contracts, on-call runbook. Reviewable by your team before any code ships.
№ 03
Build sprint
4–10 weeks
The build. We commit to your repo from day one, with weekly demos against the spec. The lead architect codes through to ship; senior engineers join when scope demands. No senior-to-junior handoff in week six.
№ 04
Eval & cutover
1–2 weeks
The eval harness runs against production data. Shadow traffic, then partial cutover, then full. Rollback path documented and tested before the switch.
№ 05
On-call handover
30 days, included
Paged on-call for the first month after cutover. Then a clean handover to your team: runbook, dashboards, escalation policy, post-mortem template.
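
The staged cutover in phase 04 (shadow, then partial, then full) can be sketched as deterministic bucketing. Hash-based bucketing, the 10% partial slice, and the names `bucketOf` and `servedByNewSystem` are all assumptions for illustration; the page names only the stages, not the mechanism.

```typescript
import { createHash } from "node:crypto";

// Maps a stable key (for example a user or tenant id) to a bucket in 0..99.
// The same key always lands in the same bucket, so a user does not flip
// between old and new systems from one request to the next.
function bucketOf(key: string): number {
  return createHash("sha256").update(key, "utf8").digest().readUInt32BE(0) % 100;
}

type Stage = { name: "shadow" | "partial" | "full"; percentOnNewSystem: number };

// Shadow serves 0% from the new system (it only mirrors traffic), partial
// serves a slice, full serves everyone. The 10% figure is illustrative.
const cutoverStages: Stage[] = [
  { name: "shadow", percentOnNewSystem: 0 },
  { name: "partial", percentOnNewSystem: 10 },
  { name: "full", percentOnNewSystem: 100 },
];

function servedByNewSystem(key: string, stage: Stage): boolean {
  return bucketOf(key) < stage.percentOnNewSystem;
}
```

Rolling back is then a matter of moving the stage pointer backwards, which is one reason to test the rollback path before the switch.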


Anti-patterns

No senior-to-junior bait-and-switch.
No phase-6 ‘optimisation’ upsell.
No time-and-materials drift past the fixed-scope proposal.

Total

7–15 weeks · diagnostic week through on-call handover

Contracted in EUR · fixed-scope or T&M

We don't publish typical ranges because every engagement starts with a fixed-price Diagnostic week. The only number that matters is the one in your proposal Friday of week one.

When Sebrona, when not

Sebrona is the right call when

Production AI inside an EU data boundary.
One architect end-to-end, not a six-person staff-aug rotation.
The system replaces a process you used to staff with people.

Sebrona is the wrong call when

You want a horizontal SaaS subscription. Buy Copilot or ChatGPT Enterprise.
You want twenty engineers in a war room. Call a Big 4.
You want a POC you'll throw away in six weeks. Don't waste either of our time.

№ 05 · Point of view

Sovereign by default

Your data must not leave your jurisdiction without your explicit permission.

Buyers across European mid-market and public sector have walked away from "send everything to a US server, trust us." They want systems that run where the data already lives: own racks, a chosen EU region, a private inference endpoint behind the firewall. We run vLLM on Linux/CUDA, llama.cpp or Ollama on Apple Silicon for smaller footprints, with quantization picked per workload (Q4_K_M for latency-bound routes, Q8 / FP16 when GPU budget allows). In every build we ship — included, not upsold — default-deny data policies, local-first inference for sensitive prompts, a full audit trail with model and routing policy logged per call, and EU residency end to end.

Residency stops

Dublin

CF EU

Frankfurt

Supabase

Bratislava

Office

On-prem

Your racks


Data stays where it lives. We pick the region; we don’t move yours.

Compliance frameworks

GDPR

ART. 32 · SECURITY

AI ACT

ART. 9 · RISK MGMT

NIS2

ART. 21 · CYBER


Frameworks we map to in the architecture brief. Not badges, citations.

Read the terms · DPA · Privacy policy

Position 01

Local-first inference where it matters.

Sensitive prompts run on open-weight models on your hardware. The rest can route to Claude (Opus 4.8 / Sonnet 4.6 / Haiku 4.5) or the GPT-5 family. Your call, written into a policy you own. Every trace shows which model got which prompt and why.

Position 02

Audit on every call.

Prompts, retrievals, tool calls, and outputs all logged with model version, routing policy, data lineage, token count, and cost. Stream exports to your SIEM. Auditable by your DPO without our help.

Position 03

EU regions, or your regions.

Cloudflare with EU Data Boundary, Supabase Frankfurt, or your own hardware in your own racks. Sebrona s.r.o. (Bratislava) holds every contract; no US partner pays us a referral fee.

Trade-off we name out loud

Open-weight models still lag the frontier on hard reasoning and long-horizon planning. You pay for sovereignty in GPU capex, ops burden, and a worse answer on the hardest few percent of prompts. Where that gap matters, those routes go to a hosted model with explicit consent and a redaction pass. The spec names which ones. Sovereignty is the default. It is not the rule.

№ 06 · Reference architecture

Six layers. One stack that doesn't move when the model does.

In plain terms: one team owns all six layers. When a model or vendor changes, your system keeps running — no rebuild, no chasing fixes across suppliers.

Defensible defaults at every layer, each picked against a documented alternative in an ADR you can argue with. We build all six in-house. No subcontracted frontend, no off-shored data tier. Swap any cell for a tool your team already runs; the contracts above and below don’t move.

The stack is opinionated. If you already run a different orchestration framework or a non-Postgres data plane, we adopt yours and the spec records the swap. The evals, the model registry, and the OTel traces don’t change.

L5 · Interface

↓ Top → bottom

L0 · Infra

L5
Interface
The visible surface: chat, dashboards, embeds, the widget your buyer recognises.
Web · React/TS
Mobile · React Native
Chat · copilot
Dashboards · embeds
React + TypeScript on the web, React Native for the mobile clients. The component primitives are shared across both. We don’t double-build the design system.
Spec
BUDGET · p95 < 50ms input → paint (warm shell)
TYPED · React + Zod props
FAILS · offline shell
L4
API Gateway
Auth, rate limits, request shaping. The seam between the public web and the model floor.
tRPC + Zod
REST · OpenAPI
Webhooks
Auth · RLS
tRPC + Zod is the default. One contract from DB row to React prop. We drop to REST when an external consumer needs OpenAPI; the validation discipline is the same either way.
Spec
BUDGET · p95 < 25ms
TYPED · tRPC + Zod
FAILS · jitter + retry
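
A hedged sketch of what "jitter + retry" can look like at the gateway: exponential backoff with full jitter. The base delay, cap, and attempt count are illustrative values, and `backoffDelayMs` and `withRetry` are hypothetical names.

```typescript
// Exponential backoff with "full jitter": wait a random time between 0 and
// min(capMs, baseMs * 2^attempt). The random source is injectable for tests.
function backoffDelayMs(
  attempt: number,
  baseMs = 100,
  capMs = 2000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Retries a failing async call up to maxAttempts times, sleeping between tries.
async function withRetry<T>(call: () => Promise<T>, maxAttempts = 3): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await call();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts - 1) {
        await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
      }
    }
  }
  throw lastError;
}
```

The jitter spreads retries from many clients over time, so a brief upstream failure is not followed by a synchronized wave of retries.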
L3
Orchestration
Workflow graphs and agent loops. Retries, fallbacks, and the policies that catch a partial failure before the user notices.
LangGraph · MCP servers
LiteLLM router · policy YAML
Langfuse · Promptfoo
n8n · queues
LangGraph for stateful agent flows we run from spec to cutover; LiteLLM as the router behind the routing.policy.yaml signed off in week 1. Tool surfaces standardise on Model Context Protocol. Same server contract for Claude, local models, and the IDE. n8n behind a queue for analyst-facing automations the client will edit themselves. Langfuse for trace-level observability and prompt-version diffs; Promptfoo as the CI gate. Eval harness and prompt-and-model registry live in this layer, not as an afterthought.
Spec
BUDGET · p95 < 400ms per tool-call (excl. model)
TYPED · agent tool schema
FAILS · policy fallback
L3 → L4 · APPLICATION TIER
L2
Model Layer
Router across providers. Prompts under version control, an eval harness that runs on every push.
Claude Opus 4.8 · Sonnet 4.6 · Haiku 4.5
Mistral Small 3.2 · Qwen 3 (Apache 2.0)
Cohere rerank-3
Voyage-3 · BGE-M3
Routing policy decides which model sees which prompt. Hosted frontier when reasoning load is high and data is non-sensitive; local open-weight when sovereignty wins, at the cost of GPU capex and a real gap on the hardest prompts.
Spec
BUDGET · p95 < 600ms TTFT (5-min cache, Claude)
TYPED · routing.policy.yaml
FAILS · backup model route
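
The backup-model route can be sketched as a wrapper that falls through to a second model when the first call rejects. `callWithBackup`, `ModelCall`, and the stand-in model functions are hypothetical; real clients, timeouts, and policy checks are out of scope here.

```typescript
type ModelCall = (prompt: string) => Promise<string>;

interface RoutedAnswer {
  text: string;
  servedBy: "primary" | "backup";
}

// Tries the primary model; any rejection (an error, or a timeout raised by the
// caller's client) falls through to the backup route. Recording which route
// served the answer keeps the fallback visible in traces.
async function callWithBackup(
  prompt: string,
  primary: ModelCall,
  backup: ModelCall,
): Promise<RoutedAnswer> {
  try {
    return { text: await primary(prompt), servedBy: "primary" };
  } catch {
    return { text: await backup(prompt), servedBy: "backup" };
  }
}
```

If the backup also fails, the rejection propagates to the caller, which keeps the failure loud instead of returning an empty answer.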
L1
Data
Postgres, pgvector, object storage. The ingestion pipelines that feed every prompt and retrieval.
Postgres
pgvector
Object store
Event stream
One database until we prove we need two. pgvector keeps retrieval next to the row it grounds; we’ll move to a dedicated vector store only when scale forces it, and we’ll write the ADR explaining why.
Spec
BUDGET · p95 < 20ms (pg)
TYPED · SQL + pgvector
FAILS · read-replica failover
L0
Infra
EU data boundary by default. On-prem when the regulator requires it. Secrets in KMS, SLOs agreed week one.
Cloudflare · EU
Supabase · Frankfurt
OTel + SLOs
Vault · KMS
EU data boundary by default, on-prem when the regulator requires it. Secrets in a KMS, never in env files. SLOs and error budgets agreed in week one. What we won’t measure, we won’t bill for.
Spec
SLO · negotiated week 1
TYPED · IaC + OTel + KMS
FAILS · multi-region failover
FULL STACK
Six layers, one system
All six layers run in production at once. Type contracts hold across the seams; the eval harness gates every route.
Type contracts
Eval harness
OTel + audit log
EU sovereignty
On-prem option
Versioned · IaC
Every seam is a contract. When a regression slips in, the eval gates fire at the layer that broke it before it reaches the user.
Spec
BUDGET · 6 layers · 1 build
TYPED · every seam · Zod + SQL
FAILS · contract first · loud

↑ data flows up · ↓ requests flow down · BUDGETs are eval-gate commitments, not historical SLOs. Measured numbers sit in each engagement’s ADR.

Read the canonical stack page →

Discipline

Discipline across every layer.

CONTRACTS

tRPC + Zod, generated

EVAL HARNESS

CI-gated, every route

DATA POLICY

Default-deny + egress allowlist

OBSERVABILITY

OTel traces + logs + metrics

№ 07 · Build baseline

Seven things in every shipped system.

Type-safe end to end. tRPC contracts, Zod validation at every boundary, generated clients; no untyped JSON across a network hop.
Eval harness on every prompt route, CI-gated via Promptfoo / Inspect AI on per-route gold sets sized to the blast radius (50–600 prompts): faithfulness, groundedness, refusal correctness, jailbreak resistance, cost ceiling, p95 latency. Pass-thresholds live in the route’s ADR (faithfulness ≥ 90 · groundedness ≥ 90 · refusal-correctness ≥ 95 · regression-detection p95 < 5%). A red build blocks merge.
Prompt and model registry pins every prompt and route to a version, with one-command rollback and A/B and shadow traffic out of the box.
OpenTelemetry traces from edge to model and back — a trace per user action, retrievals and tool calls as spans, token and cost as span attributes.
Default-deny data policies. PII redaction at the boundary, egress allowlist at the network edge, secrets in a KMS, never in env files.
ADRs in-repo for every architectural decision worth defending, so a new engineer can read them and understand why the system looks the way it does.
Cloudflare Pages preview to main in under six minutes, rollback is `git revert`, and database migrations are reversible or they don’t ship.
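
The eval-gate item in the list above can be illustrated with a small threshold check. This is a sketch under assumptions: the thresholds are read as percentages (the list gives bare numbers), the metric names are illustrative, and `failedMetrics` is hypothetical rather than a Promptfoo or Inspect AI API.

```typescript
// Pass thresholds taken from the list above, read as percentages (assumption).
const thresholds = { faithfulness: 90, groundedness: 90, refusalCorrectness: 95 } as const;

type Metric = keyof typeof thresholds;
type Scores = Record<Metric, number>;

// Returns the metrics that fell below their threshold; an empty list means green.
function failedMetrics(scores: Scores): Metric[] {
  return (Object.keys(thresholds) as Metric[]).filter((metric) => scores[metric] < thresholds[metric]);
}

// Made-up scores for one route. In CI, a non-empty result would exit
// non-zero and block the merge.
const failures = failedMetrics({ faithfulness: 93.5, groundedness: 88.0, refusalCorrectness: 97.2 });
```

Here `failures` contains only `groundedness`, so this hypothetical build would be red.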

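agent.ts · reference pattern · TypeScript

```
// reference pattern (pseudocode, not a published SDK)
// `eval` cannot be a binding name in an ES module, so it is aliased on import
import { agent, eval as evals, pii } from '@sebrona/core'
// tools are built with tool() and Zod input schemas in their own module (path illustrative)
import { searchDocs, openTicket, notifyOps } from './tools'

export const triageAgent = agent({
  model: 'claude-opus-4-8',
  policy: 'default-deny', // nothing leaves the boundary unless the policy allows it
  retrieval: { store: 'pgvector', dense: 'voyage-3', lexical: 'bm25', rerank: 'rerank-3', topK: 8 },
  tools: [searchDocs, openTicket, notifyOps],
  guardrails: [pii.redact, evals.faithfulness(0.90)], // redact PII, require faithfulness of at least 0.90
  observability: { otel: true, traces: 'always' },
})

// → typed end-to-end · eval'd on every prompt · observable by default
```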

We build everything around the model.

№ 08 · Office

Miroslav Striško

Sebrona s.r.o.

IČO 57 639 272

Budatínska 3230/16A · 851 06 Petržalka

Bratislava · EST. 2026

LinkedIn →

The architect on the contract is the engineer in the commit log.

One architect. Two to three engagements a quarter. The reply lands inside twenty-four hours, and your repo is in your team’s hands by week one.

Seventeen years on the buying side of enterprise technology before he flipped to the build side. A decade in financial markets first: execution, risk, the operations end. Then seven years running senior export at a B2B technology distributor, EU coverage. In 2024 he started building the systems his last two careers had spent seventeen years buying. The first engagements predate the company itself: delivered before the 2026 incorporation and carried into Sebrona s.r.o.

Sebrona ships two of its own products. ElektrikPro: co-founded with a domain operator who runs the workflow daily, paying customers. JARVIS: the founder’s private 24/7 AI stack and the test rig where we prove an architecture pattern before it touches a client engagement. The senior team joins when the scope demands a second head. The full shipping log is at sebrona.com/changelog. Recent field notes at sebrona.com/blog.

Office · W21 2026

Refreshed 2026-05-24

Last shipped · 2026-06-19 · Practice · B2B-distribution case study published · sovereign local AI

Capacity · 12 weeks

Now · Booked · Open

Next opening · Week 2 · for diagnostic week

On the desk

Venue operator · build sprint · week 5 of 8 · cohort-2 demo Friday.
Diagnostic week · CEE industrial group · written brief + fixed-price proposal Friday.
Architecture spec · sovereign RAG pilot · routing policy under legal review.

№ 09 · Contact

First call is free.

Bring something you want built, or something that’s stuck. The architect picks up. First thirty minutes are free; if there’s nothing worth building, the call ends with that conclusion in writing.

Bring to the first call

A system you want built, or one that’s stuck
The data it would touch: regions, volumes, sensitivity
Who on your side owns the outcome
A date that matters, if there is one

Write to the architect

First reply · Within 24 hours · from the architect · not a BDR

Engagement · Fixed-price Diagnostic week · then fixed-scope or T&M · contracted in EUR
