Adversarial robustness posture

Built for content engineered to evade.

AEGIS is designed on the assumption that its input is hostile. That posture rests on three things: a continuous red-team program, founder credibility from operating at adversarial scale, and a transparent methodology that does not reveal the weights.

Continuous red-team program

AEGIS is exercised continuously against content engineered to evade it. The red-team program operates against the production engine and against staging. New evasion patterns become test cases that ride alongside the regression set; surfaced patterns trigger detector updates.
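As a rough illustration of how a surfaced evasion pattern can become a standing test case, here is a minimal Python sketch. Everything in it is hypothetical and for illustration only: the `Case` and `Corpus` types, the `naive` and `updated` toy detectors, and the keyword check stand in for components this page does not describe.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class Case:
    """One labeled test input (hypothetical structure)."""
    case_id: str
    content: str
    expected_unsafe: bool


@dataclass
class Corpus:
    """Regression cases plus evasion cases promoted from red-team findings."""
    regression: list[Case] = field(default_factory=list)
    evasion: list[Case] = field(default_factory=list)

    def add_evasion(self, case: Case) -> None:
        # A newly surfaced pattern joins the set that every candidate must pass.
        self.evasion.append(case)

    def failures(self, classify: Callable[[str], bool]) -> list[str]:
        # IDs of cases where the detector's verdict differs from the label.
        return [
            c.case_id
            for c in self.regression + self.evasion
            if classify(c.content) != c.expected_unsafe
        ]


# Toy detectors, assumed for the example: True means "unsafe".
def naive(content: str) -> bool:
    return "badword" in content


def updated(content: str) -> bool:
    # Same check after removing zero-width spaces used to split the token.
    return "badword" in content.replace("\u200b", "")


corpus = Corpus(regression=[Case("r1", "hello", False), Case("r2", "badword", True)])
corpus.add_evasion(Case("e1", "bad\u200bword", True))
```

In this sketch, `corpus.failures(naive)` reports the new evasion case while `corpus.failures(updated)` reports none, which is the signal that would gate a detector update.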

The program runs internally and with external partners under research-license agreements that allow disclosure to AEGIS without unrestricted publication of evasion techniques.

Founder credibility

Eric MacDougall, FrameBright's CTO, led platform engineering on a 45M+ MAU consumer surface where content classification operated continuously under adversarial pressure for years. The lessons informing AEGIS's design come directly from that experience:

Signal inversion: what a model labels safe is what attackers label opportunity.
Encoding tricks: homoglyphs, zero-width characters, base-N obfuscation, layered encodings; each becomes a detector update class.
Cross-modal evasion: the most reliable evasion class hides in the seams between modalities; AEGIS has a detector dedicated to it.
Context collapse: benign-in-context content shoved into a hostile context; the contextual detector exists for this.
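To make the encoding-tricks item concrete, the following Python sketch shows one common way such obfuscation is undone before classification. It is not AEGIS's detector. The `normalize` and `peel_base64` helpers, the five-entry homoglyph table, and the three-layer limit are assumptions chosen for the example; only the standard-library calls are real.

```python
import base64
import binascii
import unicodedata

# Invisible characters often inserted to split a token.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}

# Tiny illustrative map from Cyrillic look-alikes to Latin letters.
# A production table would be far larger; this one is an assumption.
HOMOGLYPHS = str.maketrans({
    "\u0430": "a",  # Cyrillic a
    "\u0435": "e",  # Cyrillic ie
    "\u043e": "o",  # Cyrillic o
    "\u0440": "p",  # Cyrillic er
    "\u0441": "c",  # Cyrillic es
})


def normalize(text: str) -> str:
    """Strip zero-width characters, fold compatibility forms, map known homoglyphs."""
    stripped = "".join(ch for ch in text if ch not in ZERO_WIDTH)
    folded = unicodedata.normalize("NFKC", stripped)
    return folded.translate(HOMOGLYPHS).casefold()


def peel_base64(text: str, max_layers: int = 3) -> str:
    """Decode up to max_layers of base64 while each result is valid UTF-8 text."""
    current = text
    for _ in range(max_layers):
        try:
            decoded = base64.b64decode(current, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError, ValueError):
            break  # not base64, or not text: stop peeling
        current = decoded
    return current
```

A limitation worth noting: some ordinary strings happen to be valid base64 that decodes to valid text, so a real pipeline would likely classify both the original and each decoded layer rather than trust the final layer alone.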

Three patterns named on the homepage

The homepage names three live evasion patterns AEGIS is tuned for. Each has a corresponding test corpus that is exercised against every model update.

Encoded media. Content where the unsafe payload is carried inside benign-looking media (steganography, encoded text in imagery, payloads inside audio).
Cross-modal evasion. Content where each individual channel reads benign and the combination is unsafe.
Context-collapse attacks. Content that is benign-in-source-context and unsafe-in-delivery-context.

Public methodology

AEGIS's methodology is public. The detector categories, the ensemble approach, the evidence model, and the classes of adversarial test corpora are all documented. Specific detector weights and corpus contents are not published; they are exactly the surface attackers would target.

Methodology papers live in /research. {TBD-papers}

How the engine updates

When a new evasion class emerges (in the wild, in the red-team program, or in a partner disclosure), the response takes days, not a foundation-model release cycle. The relevant detector receives a tuned update; the ensemble is retrained on the updated corpus; a release candidate is rolled out to staging, exercised, then deployed.

Every deployed engine is versioned. A consumer can pin a version, request the changelog of detector and ensemble changes between two versions, and audit a held piece of content against a specific version.
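The three consumer operations (pin, changelog, audit) can be pictured with a small in-memory model. This is a sketch under stated assumptions, not the AEGIS API: `EngineVersion`, `Registry`, their method names, the version strings, and the toy classifiers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EngineVersion:
    """A deployed engine snapshot (hypothetical structure)."""
    version: str
    changes: tuple[str, ...]         # detector/ensemble changes introduced here
    classify: Callable[[str], bool]  # True means "unsafe"


class Registry:
    """Keeps deployed versions in deployment order."""

    def __init__(self) -> None:
        self._versions: list[EngineVersion] = []

    def deploy(self, engine: EngineVersion) -> None:
        self._versions.append(engine)

    def _index(self, version: str) -> int:
        for i, engine in enumerate(self._versions):
            if engine.version == version:
                return i
        raise KeyError(f"unknown version: {version}")

    def changelog(self, older: str, newer: str) -> list[str]:
        """Changes introduced after `older`, up to and including `newer`."""
        lo, hi = self._index(older), self._index(newer)
        if lo > hi:
            raise ValueError("`older` was deployed after `newer`")
        return [c for e in self._versions[lo + 1 : hi + 1] for c in e.changes]

    def audit(self, content: str, version: str) -> bool:
        """Classify content with the pinned version, not the latest one."""
        return self._versions[self._index(version)].classify(content)


registry = Registry()
registry.deploy(EngineVersion("1.0", ("initial release",), lambda s: "badword" in s))
registry.deploy(EngineVersion(
    "1.1",
    ("zero-width stripping added",),
    lambda s: "badword" in s.replace("\u200b", ""),
))
```

Here `registry.audit("bad\u200bword", "1.0")` and the same call against `"1.1"` give different verdicts, and `registry.changelog("1.0", "1.1")` names the change that explains the difference. That pairing is what makes a pinned-version audit reproducible.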

Transparency posture

The principle: publish the methodology, not the weights. Methodology lets researchers and partners reason about what AEGIS does and why. Weights would be the exact surface an attacker exploits. The line is drawn deliberately.