Building ai-hooks: Guardrails for AI Coding Agents
The Problem
The first generation of AI coding workflows was mostly about capability.
- Can the model read files?
- Can it run tests?
- Can it edit code?
- Can it open pull requests?
That is useful, but it is not the hard part anymore.
The real problem is this: an AI agent can be technically capable and still be operationally unsafe.
It can work on the wrong ticket. It can ignore a blocker that exists outside the repo. It can make a reasonable local change that violates a product decision made in Linear, Slack, or a planning doc. It can optimize for green tests while missing the actual business constraint.
That is the gap I built ai-hooks to close.
What ai-hooks Does
ai-hooks is a guardrail layer that sits around AI tooling. It does not replace the coding agent. It shapes the environment the agent works inside.
Think of it as middleware for autonomous development:
- A request comes in from an AI tool.
- ai-hooks identifies the project, user, task, and available context.
- It injects the right context before the agent acts.
- It evaluates the action against policies and boundaries.
- It records what happened so humans and other agents can respond.
The key idea is simple:
Do not just give the agent tools. Give it rules, context, and a feedback loop.
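The loop above can be sketched as a small pipeline. Everything here (`runGuardrails`, the stubbed project lookup, the in-memory audit log) is a hypothetical illustration of the flow, not the actual ai-hooks API.

```typescript
// Illustrative sketch of the guardrail loop: identify, inject, evaluate, record.
type AgentRequest = { sessionId: string; tool: string; args?: unknown };
type Decision = { allow: boolean; contextToInject: string[]; reason?: string };

const auditLog: { tool: string; allowed: boolean }[] = [];

function runGuardrails(req: AgentRequest): Decision {
  // 1. Identify project, user, and task (stubbed with a constant here).
  const project = { id: 'demo', releaseFrozen: true };

  // 2.-3. Inject the right context, then evaluate the action against policy.
  let decision: Decision;
  if (project.releaseFrozen && req.tool === 'deploy') {
    decision = { allow: false, contextToInject: [], reason: 'release is frozen' };
  } else {
    decision = { allow: true, contextToInject: [`project:${project.id}`] };
  }

  // 4. Record the outcome so humans and other agents can respond.
  auditLog.push({ tool: req.tool, allowed: decision.allow });
  return decision;
}
```

Even in this toy form, the shape is the point: the agent never calls a tool directly; every call flows through the same four steps.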
Why Prompt Instructions Are Not Enough
You can put a lot of good text into a system prompt. That still does not solve three real problems.
1. Prompts Are Static
The rules that matter are often dynamic:
- this issue is blocked
- the release is frozen
- the user only wants docs updated, not code
- a migration needs approval before deployment
- the PR is in a security-sensitive area
That kind of state changes constantly. It should not live only in a prompt.
2. Prompts Cannot Enforce Tool Boundaries
If an agent has access to file edits, git, cloud resources, and issue trackers, then some actions need real enforcement.
Examples:
- prevent production deploys from non-approved sessions
- require a linked work item before editing a billing service
- block issue closure if tests were not run
- stop the agent from changing code outside the agreed slice
That is policy, not prose.
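As a sketch of what "policy, not prose" means in practice, here are two of the rules above expressed as code. The `Action` shape and the rules themselves are illustrative assumptions, not the real rule engine.

```typescript
// Hypothetical enforcement of two of the boundary rules listed above.
type Action = { tool: string; path?: string; workItemId?: string; approved?: boolean };

function enforce(action: Action): { allow: boolean; reason?: string } {
  // Prevent production deploys from non-approved sessions.
  if (action.tool === 'deploy_production' && !action.approved) {
    return { allow: false, reason: 'production deploys require an approved session' };
  }
  // Require a linked work item before editing the billing service.
  if (action.path?.startsWith('billing/') && !action.workItemId) {
    return { allow: false, reason: 'edits under billing/ need a linked work item' };
  }
  return { allow: true };
}
```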
3. Prompts Usually Lose the Feedback Loop
Humans need to know when the AI encountered friction:
- a decision conflict
- a missing requirement
- a risky code path
- a failing test that suggests a product bug instead of a code bug
If the agent sees that signal and no one else does, the system is incomplete.
The Three Guardrail Layers
I ended up modeling ai-hooks around three layers.
1. Context Guardrails
These make sure the agent has enough situational awareness before acting.
Examples:
- inject the linked issue, acceptance criteria, and recent decisions
- inject outcome-level context so the agent understands why the work matters
- surface active blockers, risks, and known constraints
- attach relevant docs and prior implementation patterns
This is the difference between "change the button" and "change the button without breaking the pricing experiment tied to this workflow."
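A context guardrail can be as simple as assembling a bundle of strings before the agent acts. The `WorkItem` shape and `buildContextBundle` helper below are illustrative, not the real ai-hooks types.

```typescript
// Sketch: assemble situational context (task, outcome, criteria, blockers)
// into a bundle the agent sees before it touches code.
type WorkItem = {
  title: string;
  outcome: string;
  acceptanceCriteria: string[];
  blockers: string[];
};

function buildContextBundle(item: WorkItem): string[] {
  const bundle = [
    `Task: ${item.title}`,
    `Outcome: ${item.outcome}`,
    ...item.acceptanceCriteria.map(c => `Acceptance: ${c}`),
  ];
  // Surface active blockers loudly so the agent cannot miss them.
  for (const b of item.blockers) bundle.push(`BLOCKER: ${b}`);
  return bundle;
}
```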
2. Boundary Guardrails
These define what the agent is allowed to do.
Examples:
- read-only access on audit sessions
- no git push from exploratory sessions
- no database migration generation without explicit approval
- no edits outside the files matched to the work item
- no closing issues when required validations fail
This is where autonomy becomes usable. Not by removing power, but by making power conditional.
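"Conditional power" can be modeled as a capability table keyed by session mode. The modes and tool grants here are an assumption for illustration.

```typescript
// Tools are granted per session mode rather than removed globally.
type SessionMode = 'audit' | 'exploratory' | 'delivery';

const toolGrants: Record<SessionMode, string[]> = {
  audit: ['read_file', 'run_tests'],                     // read-only sessions
  exploratory: ['read_file', 'edit_file', 'run_tests'],  // no git push
  delivery: ['read_file', 'edit_file', 'run_tests', 'git_push'],
};

function isToolAllowed(mode: SessionMode, tool: string): boolean {
  return toolGrants[mode].includes(tool);
}
```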
3. Feedback Guardrails
These capture what the agent observed and did.
Examples:
- log blocked actions with reason codes
- create an issue comment when the AI finds ambiguity
- emit a risk signal when an agent detects scope creep
- store successful patterns for reuse by future agents
Without this layer, you end up with invisible autonomy. That is exactly what teams should avoid.
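The feedback layer can be sketched as a small set of typed signals written to a durable store. The `Signal` union and in-memory array are stand-ins; a real system would write to an issue tracker or event log.

```typescript
// Sketch: every notable agent observation becomes a durable, typed signal.
type Signal =
  | { kind: 'blocked_action'; reasonCode: string }
  | { kind: 'ambiguity'; comment: string }
  | { kind: 'risk'; detail: string };

const signals: Signal[] = [];

function emit(signal: Signal): void {
  // Stand-in for posting an issue comment or emitting an event.
  signals.push(signal);
}

emit({ kind: 'blocked_action', reasonCode: 'NO_WORK_ITEM' });
emit({ kind: 'risk', detail: 'scope creep: touched files outside the agreed slice' });
```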
The Hook Model
The cleanest design I found was a small set of predictable interception points.
```typescript
type HookContext = {
  projectId: string;
  sessionId: string;
  actor: 'human' | 'agent';
  toolName?: string;
  workItemId?: string;
  repo?: string;
};

type HookResult = {
  allow: boolean;
  contextToInject?: string[];
  warnings?: string[];
  reason?: string;
};

interface AiHooks {
  beforeSessionStart(ctx: HookContext): Promise<HookResult>;
  beforePrompt(ctx: HookContext, prompt: string): Promise<HookResult>;
  beforeToolCall(ctx: HookContext, input: unknown): Promise<HookResult>;
  afterToolCall(ctx: HookContext, output: unknown): Promise<void>;
  afterTaskComplete(ctx: HookContext, summary: string): Promise<void>;
}
```

This gives you enough leverage without turning the platform into a maze.
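One way to use these interception points is to wrap the tool runner so every call passes through `beforeToolCall` and `afterToolCall`. The types are repeated here so the sketch is self-contained, and the hooks implementation is a deliberately trivial assumption.

```typescript
// Sketch: a tool runner gated by before/after hooks.
type HookContext = { projectId: string; sessionId: string; actor: 'human' | 'agent'; toolName?: string };
type HookResult = { allow: boolean; reason?: string };

// Trivial hooks implementation: blocks any call without a named tool.
const hooks = {
  async beforeToolCall(ctx: HookContext, _input: unknown): Promise<HookResult> {
    return ctx.toolName ? { allow: true } : { allow: false, reason: 'unnamed tool' };
  },
  async afterToolCall(_ctx: HookContext, _output: unknown): Promise<void> {},
};

async function runTool(ctx: HookContext, input: unknown): Promise<string> {
  const gate = await hooks.beforeToolCall(ctx, input);
  if (!gate.allow) throw new Error(gate.reason ?? 'blocked');
  const output = `ran ${ctx.toolName}`; // stand-in for real tool execution
  await hooks.afterToolCall(ctx, output);
  return output;
}
```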
A Real Example: Blocking the Wrong Work
One of the most common failure modes in AI-assisted development is silent priority drift.
The AI sees a nearby bug. It looks easy. It fixes it. The fix is good. The tests pass.
But it was not the work the user asked for.
That sounds minor until you multiply it across dozens of sessions.
Here is the kind of hook policy I wanted:
```typescript
export async function beforeToolCall(ctx: HookContext): Promise<HookResult> {
  const workItem = await getWorkItem(ctx.workItemId);
  const activeBlockers = await getActiveBlockers(ctx.projectId);

  if (!workItem) {
    return {
      allow: false,
      reason: 'No linked work item for writable action.',
    };
  }

  if (activeBlockers.some(blocker => blocker.scope === workItem.scope)) {
    return {
      allow: false,
      reason: 'Work item is currently blocked by an active project signal.',
    };
  }

  return {
    allow: true,
    contextToInject: [
      `Outcome: ${workItem.outcome}`,
      `Acceptance criteria: ${workItem.acceptanceCriteria.join('; ')}`,
      `Recent decisions: ${workItem.decisions.join('; ')}`,
    ],
  };
}
```

The important part is not the TypeScript. It is the behavior.
The agent gets stopped before it burns time in the wrong direction, and it gets the missing context when the work is valid.
ai-hooks and MCP
MCP gives AI tools a standardized way to connect to external tools and data. That is the transport layer. ai-hooks is the control layer that decides how that access should be used.
MCP answers:
- what tools exist
- what resources can be read
- what actions can be executed
ai-hooks answers:
- should this session see those tools right now
- what context should be injected before the tool is used
- what policy must pass before the action is allowed
- what follow-up signals should be emitted afterward
That combination is what makes AI tooling feel production-capable instead of demo-capable.
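One concrete way the two layers compose is a wrapper around the MCP client. The `McpClient` shape below is a simplification for illustration, not the actual MCP SDK interface.

```typescript
// Sketch: ai-hooks as a control layer in front of an MCP-style tool call.
type McpClient = { callTool(name: string, args: unknown): Promise<unknown> };

function withGuardrails(client: McpClient, visibleTools: Set<string>): McpClient {
  return {
    async callTool(name, args) {
      // Policy gate: should this session see this tool right now?
      if (!visibleTools.has(name)) {
        throw new Error(`tool "${name}" is hidden from this session`);
      }
      const result = await client.callTool(name, args);
      // Follow-up signals would be emitted here (omitted in this sketch).
      return result;
    },
  };
}
```

MCP still owns transport and tool discovery; the wrapper only decides whether and how that access is used.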
Product-Aware Guardrails
The most important design decision was that the guardrails should be product-aware, not just repo-aware.
Repo-aware guardrails can say:
- do not edit `billing/`
- require tests before commit
- never push to `main`
Product-aware guardrails can say:
- do not change onboarding copy while the experiment is still running
- do not resolve this ticket until compliance has approved the workflow
- warn if the implementation conflicts with the decision made last week
- attach the top two business risks tied to this outcome before the task begins
This is the shift from "AI assistant" to "AI teammate with context."
Hard Lessons From Building It
1. Too Many Rules Feels Like Latency
Every hook adds cost.
If every tool call triggers six network round trips and a giant context assembly pipeline, the agent becomes slow and unpleasant. So the guardrails need to be opinionated and minimal.
My rule became:
intercept the smallest number of moments that give the highest control.
2. Warnings Are Not Enough for High-Risk Actions
I started with soft warnings for many operations. That was a mistake.
For some actions, warnings are fine.
- documentation changes
- low-risk refactors
- read-only analysis
For others, warnings are theater.
- deploys
- migrations
- cross-boundary edits
- issue state transitions
If an action is meaningfully risky, the hook needs to block it.
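The lesson reduces to a tier table: low-risk actions warn, high-risk actions block. The specific action names and tiers below are an illustrative assumption.

```typescript
// Mapping actions to enforcement levels: soft warnings for low-risk work,
// hard blocks for actions where a warning is theater.
type Enforcement = 'allow' | 'warn' | 'block';

const riskTiers: Record<string, Enforcement> = {
  edit_docs: 'allow',
  refactor: 'warn',
  run_migration: 'block',
  deploy: 'block',
  close_issue: 'block',
};

function enforcementFor(action: string): Enforcement {
  return riskTiers[action] ?? 'warn'; // unknown actions default to a warning
}
```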
3. Teams Need an Audit Trail
Once agents become useful, people ask the same question every time:
"Why did it do that?"
So every blocked action, injected context bundle, and emitted signal needs a durable record.
Not because the AI is untrustworthy by default, but because operational systems need explainability.
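A minimal audit record that can answer "why did it do that?" might look like the sketch below; the field names and in-memory trail are assumptions standing in for a real event store.

```typescript
// Sketch of a durable audit record for every guardrail decision.
type AuditEvent = {
  at: string; // ISO timestamp
  sessionId: string;
  action: string;
  outcome: 'allowed' | 'blocked';
  reason?: string;
  injectedContext?: string[];
};

const trail: AuditEvent[] = [];

function record(event: Omit<AuditEvent, 'at'>): void {
  trail.push({ at: new Date().toISOString(), ...event });
}

record({ sessionId: 's1', action: 'deploy', outcome: 'blocked', reason: 'no approval' });
```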
A Practical Configuration Shape
I found it useful to keep configuration boring and explicit.
```yaml
project: plannable
policies:
  writable_sessions_require_work_item: true
  block_deploys_without_approval: true
  require_tests_before_issue_close: true
context:
  inject_decisions: true
  inject_active_risks: true
  inject_acceptance_criteria: true
boundaries:
  protected_paths:
    - billing/**
    - auth/**
  allowed_tools:
    - read_file
    - edit_file
    - run_tests
    - create_pr
feedback:
  report_blockers: true
  create_risk_signals: true
  store_success_patterns: true
```

The system underneath can be sophisticated. The operating model should not be.
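Checking an edit against `protected_paths` can stay equally boring. This sketch handles only trailing `/**` patterns with a naive prefix check; a real implementation would use a proper glob matcher.

```typescript
// Naive check of a file path against protected patterns like "billing/**".
// Assumption: patterns are either exact prefixes or end in "/**".
function isProtected(path: string, patterns: string[]): boolean {
  return patterns.some(pattern => {
    // "billing/**" -> "billing/" so any file under the directory matches.
    const prefix = pattern.endsWith('/**') ? pattern.slice(0, -2) : pattern;
    return path.startsWith(prefix);
  });
}
```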
What Good Looks Like
When the system is working well, the developer barely notices the guardrails.
They notice that:
- the AI already understands the ticket
- risky actions require the right approvals
- blocked work is surfaced early
- decisions stop getting lost
- the AI leaves an audit trail humans can trust
That is the bar.
Not maximum autonomy.
Reliable autonomy.
The Bigger Takeaway
Most teams do not need a more powerful coding model first.
They need a better operating envelope around the models they already have.
That means:
- stronger context injection
- explicit action boundaries
- durable feedback loops
- product-aware policies instead of repo-only rules
That is what ai-hooks is for.
The future of AI development tooling is not just better code generation. It is better control systems around autonomous behavior.