March 20, 2025

Policies That Build Themselves: OPA/Rego Behavioral Baselines in INTERCEPT

How INTERCEPT uses OPA and Rego to enforce API security policies derived from real observed traffic — not written by hand — and why that distinction matters for AI governance.

The standard OPA/Rego tutorial walks you through writing a policy that checks whether a user has the right role before allowing an action. It’s a reasonable introduction. It also misses the most interesting use of the policy engine: letting the traffic tell you what the policy should be.

INTERCEPT takes the second approach. Here’s how it works and why the distinction matters.

What INTERCEPT Is

INTERCEPT is a network sensor — a transparent proxy that sits inline on the network without requiring any changes to the services it observes. It intercepts HTTP/HTTPS traffic, runs it through a 24-rule security scanner, maps the complete API surface of every service, and enforces behavioral policies via OPA.

The eBPF/Tetragon layer handles kernel-level process attribution: every TCP connection is tagged with the process that originated it. When the Ollama API gets called, INTERCEPT knows which process made the call, not just which IP. When a credential appears in a request header, it knows which application sent it.
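Tetragon performs this attribution in-kernel via eBPF. As a rough userspace approximation (a sketch, not INTERCEPT's implementation), the same socket-to-process join can be done by parsing /proc/net/tcp entries and matching socket inodes against /proc/<pid>/fd symlinks:

```python
import os

def parse_proc_net_tcp_line(line):
    """Parse one /proc/net/tcp entry line (not the header) into
    (local_ip, local_port, inode)."""
    fields = line.split()
    ip_hex, port_hex = fields[1].split(":")
    inode = int(fields[9])
    # IPv4 bytes are hex-encoded little-endian: "0100007F" -> 127.0.0.1
    octets = [str(int(ip_hex[i:i + 2], 16)) for i in range(6, -2, -2)]
    return ".".join(octets), int(port_hex, 16), inode

def socket_inode_to_pid(inode):
    """Scan /proc/<pid>/fd for a symlink to socket:[inode] (Linux only)."""
    target = f"socket:[{inode}]"
    for pid in filter(str.isdigit, os.listdir("/proc")):
        fd_dir = f"/proc/{pid}/fd"
        try:
            for fd in os.listdir(fd_dir):
                if os.readlink(f"{fd_dir}/{fd}") == target:
                    return int(pid)
        except OSError:
            continue  # process exited, or insufficient permissions
    return None
```

An eBPF hook sees the connect() in the originating process context, so it gets this mapping for free and race-free; the /proc scan above is the slow, best-effort version of the same idea.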

The policy engine is the part that makes INTERCEPT more than a passive observer.

The Problem with Hand-Written API Security Policies

Traditional API security tools give you a policy language and expect you to write rules. Block requests from this IP range. Require this header. Reject payloads larger than N bytes. Rate-limit this endpoint.

The problem: you can only write rules for behavior you’ve already thought about. The actual attack surface of a production API is discovered at runtime — undocumented endpoints, unexpected content types, parameter combinations that nobody tested, authentication flows that were added by one team and forgotten by another.

Writing policies before you’ve observed the traffic means you’re guessing. You might catch the obvious cases. You’ll miss the long tail.

Baseline Learning: Let Traffic Define the Policy

INTERCEPT’s approach: observe first, enforce second.

During the learning phase, INTERCEPT runs in passive mode — it logs everything but blocks nothing. Every request generates a flow record capturing process identity, endpoint path and method, authentication state and scheme, content type, payload size, response status, and timestamp. That’s the full behavioral fingerprint of a single API call.
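The fields above can be sketched as a simple record type (field names are illustrative, not INTERCEPT's actual schema):

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class FlowRecord:
    """One observed API call: the behavioral fingerprint of that call."""
    process: str        # eBPF-attributed originating process
    method: str
    path: str
    authenticated: bool
    auth_scheme: str    # e.g. "bearer", "basic", "" when unauthenticated
    content_type: str
    payload_bytes: int
    status: int
    timestamp: float    # epoch seconds

rec = FlowRecord("ollama-client", "POST", "/api/generate",
                 True, "bearer", "application/json", 2048, 200, 1742400000.0)
```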

Over time, these records build a model of what normal looks like for each endpoint: which processes call it, what content types they send, what the payload size distribution looks like at the p99, whether authentication is consistently present.
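Building that per-endpoint model from a batch of flow records might look like the following sketch (plain dicts and a nearest-rank p99; the aggregation keys are assumptions, not INTERCEPT's internals):

```python
from collections import defaultdict
import math

def percentile(values, p):
    """Nearest-rank percentile of a non-empty list (no numpy needed)."""
    ordered = sorted(values)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[max(rank - 1, 0)]

def build_baseline(flows):
    """Aggregate flow records into a per-endpoint behavioral model."""
    by_endpoint = defaultdict(lambda: {"callers": set(), "content_types": set(),
                                       "sizes": [], "auth_seen": []})
    for f in flows:
        ep = by_endpoint[(f["method"], f["path"])]
        ep["callers"].add(f["process"])
        ep["content_types"].add(f["content_type"])
        ep["sizes"].append(f["payload_bytes"])
        ep["auth_seen"].append(f["authenticated"])
    return {
        ep: {
            "allowed_callers": sorted(v["callers"]),
            "content_types": sorted(v["content_types"]),
            "max_payload": percentile(v["sizes"], 99),
            "auth_required": all(v["auth_seen"]),
        }
        for ep, v in by_endpoint.items()
    }
```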

That model becomes the policy.

Rego Policy Generation

OPA’s policy language has a property that makes this approach work: policies are data. A Rego policy is a set of rules expressed as logic, but the inputs those rules evaluate against can be derived programmatically rather than written by hand.

INTERCEPT generates Rego policies from the observed baseline. For a given endpoint, the generated policy encodes the allowed caller set (which processes have legitimately called this endpoint), the expected content types, authentication requirements, and payload size thresholds derived from the observed p99 distribution with appropriate headroom. Each rule produces a structured denial message when it fires — not just a boolean, but an explanation.

The threshold values and allowed sets come from the baseline data, not from configuration files someone edited. INTERCEPT generates the policy; the operator reviews and approves it. The result is a human-readable expression of what the system already does — which is a meaningfully different thing from a policy someone tried to predict in advance.
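A minimal generator in this spirit might render a baseline entry as Rego text. This is a sketch: the package name, the 2x headroom factor, the input field names, and the classic `deny[msg]` rule shape (OPA 1.0 prefers `deny contains msg if`) are all assumptions, not INTERCEPT's actual output.

```python
def generate_rego(method, path, baseline):
    """Render one endpoint's baseline as a Rego deny-rule policy string."""
    callers = ", ".join(f'"{c}"' for c in baseline["allowed_callers"])
    ctypes = ", ".join(f'"{c}"' for c in baseline["content_types"])
    limit = baseline["max_payload"] * 2  # observed p99 with 2x headroom
    lines = [
        f"# generated from observed baseline for {method} {path}",
        "package intercept.endpoint",
        "",
        f"allowed_callers := {{{callers}}}",
        f"allowed_content_types := {{{ctypes}}}",
        f"max_payload := {limit}",
        "",
        "deny[msg] {",
        "    not allowed_callers[input.process]",
        '    msg := sprintf("unknown caller %s", [input.process])',
        "}",
        "",
        "deny[msg] {",
        "    input.payload_bytes > max_payload",
        '    msg := sprintf("payload %d exceeds baseline limit %d",',
        "                   [input.payload_bytes, max_payload])",
        "}",
    ]
    if baseline.get("auth_required"):
        lines += ["", "deny[msg] {", "    not input.authenticated",
                  '    msg := "baseline requires authentication"', "}"]
    return "\n".join(lines)
```

Each `deny` rule contributes its structured message only when it fires, which is where the "explanation, not just a boolean" property comes from.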

AI Governance: Why Behavioral Baselines Matter

The AI governance use case is where this approach shows its real value.

Traditional security tools have no concept of an “AI service call.” They see HTTP requests. They can’t distinguish between a developer querying an internal Ollama endpoint for a legitimate workflow and an application quietly exfiltrating data to an external LLM API.

INTERCEPT’s behavioral baseline captures the full picture of AI service usage:

Shadow AI detection: If a new process starts calling an external AI endpoint, it’s not in the baseline. The policy fires. You know immediately — at the network level, with process attribution — that something is calling an AI service that nobody approved.

PII-in-prompt detection: INTERCEPT’s scanner rules include a PII detector that runs on request payloads. The baseline tells it which endpoints are expected to receive PII and which aren’t. An endpoint that has never received a Social Security number pattern suddenly doing so is an anomaly, not just a match.
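A sketch of that baseline-aware check, using a deliberately naive SSN-shaped regex (the `pii_expected` flag and the pattern are illustrative, not INTERCEPT's scanner rules):

```python
import re

# Hypothetical SSN-shaped pattern; a production rule would be stricter
# about valid area/group numbers.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def pii_anomaly(endpoint, payload, baseline):
    """Flag PII only where the baseline says this endpoint never carries it."""
    matched = bool(SSN_PATTERN.search(payload))
    expected = baseline.get(endpoint, {}).get("pii_expected", False)
    return matched and not expected
```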

Token exfiltration signals: AI inference typically has roughly correlated input/output token volumes. A request with a 500-token prompt producing a 15,000-token completion is anomalous. That pattern is a signal — either something is using the model as a relay, or an adversarial prompt is causing unexpected output. The behavioral baseline makes the anomaly detectable without needing to define the attack in advance.
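The ratio check itself is simple once the baseline supplies a typical completion-to-prompt ratio for the endpoint (the tolerance multiple here is an illustrative parameter, not INTERCEPT's value):

```python
def token_ratio_anomalous(prompt_tokens, completion_tokens,
                          baseline_ratio, tolerance=5.0):
    """Flag completions whose output/input token ratio departs from baseline.

    baseline_ratio: typical completion/prompt ratio observed for this endpoint.
    tolerance: how many multiples of baseline count as anomalous.
    """
    if prompt_tokens <= 0:
        return True  # output with no prompt is itself anomalous
    ratio = completion_tokens / prompt_tokens
    return ratio > baseline_ratio * tolerance
```

With a baseline ratio of 2.0, the 500-token-prompt / 15,000-token-completion case from above (ratio 30) trips the check, while an ordinary 500/800 exchange does not.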

OPA Decision Logging

Every OPA policy evaluation produces a structured decision log entry. Each record contains the policy path evaluated, the full input that triggered evaluation, the deny messages that fired (including which rule and why), and a timestamp. INTERCEPT ships these to a log store.

This log is the audit trail. When you’re asked “which applications accessed the AI service last Tuesday and what did they send?” — that’s a query against decision logs, not a network packet capture. The policy evaluation records answer the compliance question directly.
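That "last Tuesday" question reduces to a filter over decision records. A sketch, with an illustrative record shape (not OPA's exact decision-log wire format):

```python
from datetime import datetime, timezone, date

def ai_service_accesses(decision_logs, service_prefix, day):
    """Which processes hit the AI service on a given day, and with what?"""
    hits = []
    for rec in decision_logs:
        ts = datetime.fromtimestamp(rec["timestamp"], tz=timezone.utc)
        if ts.date() == day and rec["input"]["path"].startswith(service_prefix):
            hits.append({
                "process": rec["input"]["process"],
                "path": rec["input"]["path"],
                "payload_bytes": rec["input"]["payload_bytes"],
                "denied": bool(rec["deny_messages"]),
            })
    return hits

# Illustrative decision records
logs = [
    {"timestamp": 1742428800.0, "deny_messages": [],  # 2025-03-20 UTC
     "input": {"path": "/api/generate", "process": "app", "payload_bytes": 2048}},
    {"timestamp": 1742428800.0, "deny_messages": [],
     "input": {"path": "/health", "process": "probe", "payload_bytes": 0}},
]
hits = ai_service_accesses(logs, "/api/", date(2025, 3, 20))
```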

The Practical Difference from Firewall Rules

A firewall rule blocks or allows based on source IP, destination port, and protocol. It has no concept of process identity, payload content, behavioral history, or policy reasoning.

What INTERCEPT’s OPA integration adds:

  1. Process identity — via eBPF attribution, not IP. Containers can share IPs; processes don't share PIDs.
  2. Payload awareness — the policy sees what was sent, not just that a connection was made.
  3. Behavioral context — the policy knows whether this request is normal for this endpoint, based on history.
  4. Explicit reasoning — every deny includes a message explaining why it fired.

When something gets blocked, the log tells you which process, calling which endpoint, with what payload, and which policy rule fired. That’s a qualitatively different debugging experience from “packet dropped by rule 47.”

Where This Goes

The next phase of INTERCEPT’s policy engine is automated policy drift detection: when observed traffic diverges significantly from the baseline (new callers, new endpoints, shifted payload distributions), INTERCEPT generates a diff showing what changed and proposes an updated policy for review.
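That drift diff could be sketched as a comparison of two baseline snapshots for one endpoint (the field names and the 1.5x size-drift threshold are assumptions, not INTERCEPT's values):

```python
def baseline_drift(old, new, size_drift_factor=1.5):
    """Diff two baseline snapshots for one endpoint.

    Returns a list of human-readable drift findings; empty means no drift.
    """
    findings = []
    new_callers = set(new["allowed_callers"]) - set(old["allowed_callers"])
    if new_callers:
        findings.append(f"new callers: {sorted(new_callers)}")
    new_types = set(new["content_types"]) - set(old["content_types"])
    if new_types:
        findings.append(f"new content types: {sorted(new_types)}")
    if new["max_payload"] > old["max_payload"] * size_drift_factor:
        findings.append(
            f"p99 payload shifted {old['max_payload']} -> {new['max_payload']}")
    return findings
```

A non-empty findings list is what would trigger the proposed-policy-update flow; an empty list means the existing policy still matches observed behavior.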

The operator doesn’t write the policy. The operator approves or rejects what the system observed.

That’s the goal: policy as code derived from behavior, not behavior required to match policy someone wrote from memory.


INTERCEPT is a self-hosted API intelligence and AI governance platform. It runs transparently on SOVEREIGN infrastructure. Architecture writeup and demo available on request. Currently open to Staff/Principal Architect roles where this kind of thinking is the baseline, not the exception. imsre.dev or LinkedIn.