Anthril
Frontier · active

Applied AI for Businesses

AI becomes genuinely useful to a business when it is embedded in a real workflow, handles the structured majority of decisions autonomously, and routes the nuanced minority to humans with proper context.

What we ruled out

Hallucination-tolerant consumer apps
Generative ad-tech
RAG over toy corpora
Operating infrastructure on the customer's behalf
Consulting engagements that produce decks instead of shipped software

Frontier 2 is where Anthril’s research becomes commercial. Every product here is designed to deliver measurable business value — faster decisions, lower operating costs, reduced manual overhead — while also validating specific Aurora research components in real production conditions.

How it works

Frontier 2 products use existing frontier LLMs as the reasoning engine. What we layer on top are Aurora-derived backend systems: event schemas for representing domain workflows, episodic memory for institutional knowledge, and structured handoff protocols for human-AI decision routing.

The operating model: route the high-volume structured portion of domain decisions to AI; route the nuanced high-stakes portion to human operators with the context they need to act decisively. The AI clears the path. The human makes the calls that matter.
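The routing rule described above can be sketched in a few lines of Python. This is an illustration only, not Anthril's implementation: the `Decision` fields, the `Route` enum, and the `route` function are hypothetical names chosen for the example, and a real system would presumably derive `structured` and `high_stakes` from the domain's event schema rather than take them as flags.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AI = "ai"
    HUMAN = "human"


@dataclass(frozen=True)
class Decision:
    """Hypothetical decision record; all field names are illustrative."""
    kind: str          # e.g. "refund_request"
    structured: bool   # True when the case fits a known, typed workflow
    high_stakes: bool  # True when an error would be costly or hard to undo
    context: str       # summary a human operator would need in order to act


def route(decision: Decision) -> Route:
    """Send structured, low-stakes cases to the AI; everything else to a human."""
    if decision.structured and not decision.high_stakes:
        return Route.AI
    return Route.HUMAN
```

In this sketch every decision carries its `context` regardless of route, so a case handed to a human arrives with the information needed to act on it.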

The governance layer

VGuard enforces runtime policy at the tool-call level — before mutations reach production, not after. As coding agents and business automation agents become standard infrastructure, the question is not whether to govern them but where. Post-hoc audit is too slow. Pre-deployment testing misses runtime context. The tool-call boundary is the primary control point.
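To make the idea of a tool-call control point concrete, here is a minimal sketch of a policy check that runs before a tool call is executed. It does not reflect VGuard's actual API, which this page does not describe: `ToolCall`, `PolicyViolation`, `deny_production_deletes`, and `guarded_execute` are all hypothetical names, and the tool name `db.delete` and the `env` argument are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical representation of a single tool invocation by an agent."""
    tool: str
    args: Dict[str, Any]


class PolicyViolation(Exception):
    """Raised when a policy denies a tool call."""


# A policy returns a denial reason, or None to allow the call.
Policy = Callable[[ToolCall], Optional[str]]


def deny_production_deletes(call: ToolCall) -> Optional[str]:
    """Example policy: block deletes that target a production environment."""
    if call.tool == "db.delete" and call.args.get("env") == "production":
        return "deletes against production are not allowed"
    return None


def guarded_execute(
    call: ToolCall,
    policies: List[Policy],
    execute: Callable[[ToolCall], Any],
) -> Any:
    """Check every policy first; run the call only if none of them denies it."""
    for policy in policies:
        reason = policy(call)
        if reason is not None:
            raise PolicyViolation(reason)
    return execute(call)
```

The point of the design is ordering: the check happens before `execute` is invoked, so a denied mutation never runs, whereas an audit log would only record it after the fact.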

Open questions

What we are still working on.

Q01

Do event schemas outperform token-stream prompting for domain decisions when using the same underlying LLM?

H2.1 tests this. If typed event schemas consistently reduce procedural errors and hallucinations in production — with the LLM held constant — the representation advantage is confirmed independently of model architecture. This validates Aurora's event-first hypothesis in a live product context.
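As a rough illustration of what a typed event schema can look like, the sketch below validates a raw payload into a typed record before it is used. The invoice domain, the `InvoiceEvent` fields, and the allowed statuses are assumptions invented for this example; the page does not specify Aurora's actual schemas.

```python
from dataclasses import dataclass
from typing import Any, Dict

# Assumed set of valid statuses for this illustrative domain.
ALLOWED_STATUSES = {"received", "approved", "rejected"}


@dataclass(frozen=True)
class InvoiceEvent:
    """Hypothetical typed event; field names are illustrative."""
    invoice_id: str
    amount_cents: int
    status: str


def parse_invoice_event(raw: Dict[str, Any]) -> InvoiceEvent:
    """Convert a raw payload into a typed event, rejecting invalid values."""
    event = InvoiceEvent(
        invoice_id=str(raw["invoice_id"]),      # KeyError if the field is missing
        amount_cents=int(raw["amount_cents"]),  # ValueError if not an integer
        status=str(raw["status"]),
    )
    if event.amount_cents < 0:
        raise ValueError("amount_cents must be non-negative")
    if event.status not in ALLOWED_STATUSES:
        raise ValueError(f"unknown status: {event.status!r}")
    return event
```

With a schema like this, a status the workflow does not define fails at parse time instead of flowing into a downstream decision, which is one kind of procedural error a free-text prompt cannot rule out structurally.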

Investigated by: vguard

Q02

Does runtime governance at the tool-call level outperform pre-deployment testing and post-hoc audit?

H2.3 tests this. VGuard enforces policy at the tool-call boundary, before mutations reach production. The question is whether this control point reduces unrecoverable errors by a greater margin than upstream testing alone, and what that implies for how Aurora-based agents will eventually need to be governed.

Investigated by: vguard

Q03

Do memory-augmented business tools outperform stateless RAG on institutional knowledge tasks?

H2.2 tests whether Aurora-inspired multi-tier memory (episodic logs, semantic domain rules, procedural workflow patterns) outperforms stateless retrieval on tasks requiring exception recall, longitudinal context, and stale-knowledge detection. Testing is ongoing.
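The three memory tiers and two of the tasks named above (exception recall and stale-knowledge detection) can be sketched as a small data structure. This is a toy model under stated assumptions, not the system under test: the `Memory` and `Rule` classes, the keyword-based recall, and the age-based staleness check are all hypothetical simplifications.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, List


@dataclass
class Rule:
    """Hypothetical semantic rule with the time it was last confirmed."""
    text: str
    last_verified: datetime


@dataclass
class Memory:
    episodic: List[str] = field(default_factory=list)               # log of what happened
    semantic: Dict[str, Rule] = field(default_factory=dict)         # domain rules by key
    procedural: Dict[str, List[str]] = field(default_factory=dict)  # workflow steps by name

    def record(self, entry: str) -> None:
        """Append an entry to the episodic log."""
        self.episodic.append(entry)

    def recall(self, keyword: str) -> List[str]:
        """Return past episodes mentioning the keyword (naive substring match)."""
        return [e for e in self.episodic if keyword in e]

    def stale_rules(self, now: datetime, max_age: timedelta) -> List[str]:
        """Return keys of rules not verified within max_age of now."""
        return [k for k, r in self.semantic.items() if now - r.last_verified > max_age]
```

A stateless retriever would return a matching rule without regard to when it was last confirmed; keeping `last_verified` alongside the rule is what lets this sketch flag knowledge that may be out of date.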

Investigated by: claude-plugins

Currently open

VGuard is in its Week 1–6 foundation phase. The monorepo, CLI trust model, and policy bundle foundation are in place. Early access is open for teams using Claude Code in production.