Benchmarking AEGIS against public content-safety datasets
Abstract
This note is an ongoing disclosure of AEGIS performance against publicly available content-safety datasets. It reports structural results (which datasets were used and which detector configurations were evaluated) without publishing the specific precision and recall figures that would aid evasion. The methodology is fully described; raw results are available under a research license to qualified academic partners.
Status
Research note · full PDF pending. For now, this page is the canonical abstract. The complete paper will be published once external review and distribution are finalized, and this page will link to it at the same URL when it is ready. Subscribe for release alerts via contact · research interest.