Three open-source web apps demonstrate OpenAI's Privacy Filter for PII detection and redaction
Hugging Face engineers built three reference applications using OpenAI's newly released Privacy Filter model and Gradio Server, showing how to detect and redact personally identifiable information in documents, images, and text pastes at scale.
- OpenAI released Privacy Filter, a 1.5B-parameter, Apache 2.0-licensed model that detects eight categories of personally identifiable information (PII) in a single pass over a 128k-token context window and achieves state-of-the-art performance on the PII-Masking-300k benchmark.
- Hugging Face engineers built three reference applications demonstrating the model: a document privacy explorer that highlights PII spans in PDF and DOCX files, an image anonymizer that locates PII via OCR and covers it with draggable redaction bars on a canvas, and a pastebin tool that generates two URLs, one for a public redacted view and one for a private unredacted view.
- All three apps use Gradio Server, a FastAPI-based backend framework that provides queuing, GPU allocation, and unified endpoint handling for both browser and programmatic clients, eliminating the need to duplicate business logic across different interface layers.
OpenAI released Privacy Filter this week as an open-source PII detector, now available on Hugging Face's model hub. The model has 1.5 billion total parameters, of which 50 million are active, and is released under the permissive Apache 2.0 license. It identifies personally identifiable information across eight categories (private person, address, email, phone number, URL, date, account number, and secret) in a single forward pass over a 128,000-token context window. According to Hugging Face, the model achieves state-of-the-art performance on the PII-Masking-300k benchmark.
Hugging Face engineers demonstrated three distinct applications built around Privacy Filter and Gradio Server, a backend framework for pairing custom frontends with queued inference. The Document Privacy Explorer accepts PDF or DOCX uploads, extracts the text, runs it through Privacy Filter in a single pass, and renders the document in a styled HTML reader in which detected PII spans are highlighted by category and can be shown or hidden with client-side toggles. Because the full document is processed in one 128k-token context window, text offsets map cleanly to rendered positions without chunking artifacts.
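The offset-to-highlight step can be illustrated with a minimal sketch. It assumes the detector returns spans as `(start, end, label)` character offsets into the extracted text; that tuple shape, the label strings, and the `highlight_spans` helper are hypothetical and not taken from the apps themselves.

```python
from html import escape


def highlight_spans(text, spans):
    """Wrap each (start, end, label) span in a <mark> tag.

    All text is HTML-escaped. The span format and label names are
    assumptions made for this illustration.
    """
    parts = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor:
            continue  # skip a span that overlaps one already emitted
        parts.append(escape(text[cursor:start]))
        parts.append(
            f'<mark class="pii" data-category="{escape(label)}">'
            f"{escape(text[start:end])}</mark>"
        )
        cursor = end
    parts.append(escape(text[cursor:]))
    return "".join(parts)


doc = "Contact Ana at ana@example.com."
spans = [(8, 11, "private_person"), (15, 30, "private_email")]
html_out = highlight_spans(doc, spans)
print(html_out)
```

With a `data-category` attribute on every highlight, a frontend could toggle a whole category by switching a CSS class, with no second request to the server.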
The Image Anonymizer accepts screenshots or images, applies optical character recognition to extract bounding boxes for each word, reconstructs full text with a character-to-box mapping, runs Privacy Filter over the reconstructed text, and returns pixel-aligned rectangles for detected PII. The frontend renders these as draggable black bars on a canvas, allowing users to manually adjust positions, add new bars, and toggle entire categories on and off. Image export happens client-side without server round-trips.
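The character-to-box mapping described above might look roughly like the following sketch. It assumes OCR output is a list of `(word, (x0, y0, x1, y1))` pairs and that detected spans are `(start, end, label)` character offsets; both shapes and both function names are hypothetical. As a simplification, each span is merged into one rectangle, which is only sensible when the span sits on a single line.

```python
def build_text_and_index(words):
    """Join OCR words with single spaces and record, for every character
    of the joined text, the index of the word that produced it."""
    pieces, char_to_word = [], []
    for i, (word, _box) in enumerate(words):
        if i > 0:
            pieces.append(" ")
            char_to_word.append(None)  # the separator belongs to no word
        pieces.append(word)
        char_to_word.extend([i] * len(word))
    return "".join(pieces), char_to_word


def spans_to_boxes(words, char_to_word, spans):
    """Return one (label, (x0, y0, x1, y1)) rectangle per detected span,
    covering every word the span touches."""
    rects = []
    for start, end, label in spans:
        hit = sorted({w for w in char_to_word[start:end] if w is not None})
        if not hit:
            continue
        boxes = [words[w][1] for w in hit]
        rects.append((
            label,
            (min(b[0] for b in boxes), min(b[1] for b in boxes),
             max(b[2] for b in boxes), max(b[3] for b in boxes)),
        ))
    return rects


# Hypothetical OCR output: three words on one line.
ocr_words = [
    ("Call", (10, 10, 50, 30)),
    ("555-0100", (60, 10, 150, 30)),
    ("now", (160, 10, 200, 30)),
]
text, index = build_text_and_index(ocr_words)
rects = spans_to_boxes(ocr_words, index, [(5, 13, "private_phone")])
print(text)   # Call 555-0100 now
print(rects)  # [('private_phone', (60, 10, 150, 30))]
```

Returning plain rectangles keeps the server response small and leaves dragging, adding, and hiding bars entirely to the canvas code in the browser.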
SmartRedact Paste is a pastebin tool that applies Privacy Filter to submitted text, replaces detected PII spans with category placeholders (e.g., <PRIVATE_EMAIL>), and generates two URLs: a public one serving the redacted version and a token-gated private one showing the original with highlighted spans. The app serves both as FastAPI endpoints within Gradio Server, demonstrating how queued model endpoints and plain HTTP routes can coexist in a single application. All three apps show how Gradio Server's unified queuing and client libraries eliminate code duplication across browser-based and programmatic access patterns.
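A minimal sketch of the placeholder substitution and the two-URL scheme follows, using only the Python standard library. The `(start, end, label)` span format, the URL layout, and the helper names are assumptions for illustration; the actual app's routes and token handling may differ.

```python
import secrets


def redact(text, spans):
    """Replace each (start, end, label) span with an upper-case
    placeholder such as <PRIVATE_EMAIL>."""
    out, cursor = [], 0
    for start, end, label in sorted(spans):
        if start < cursor:
            continue  # skip a span that overlaps one already replaced
        out.append(text[cursor:start])
        out.append(f"<{label.upper()}>")
        cursor = end
    out.append(text[cursor:])
    return "".join(out)


def make_paste_urls(paste_id):
    """Return (public_url, private_url, token). The paths are hypothetical."""
    token = secrets.token_urlsafe(16)  # unguessable, URL-safe token
    return f"/p/{paste_id}", f"/p/{paste_id}/private?token={token}", token


def token_matches(expected, supplied):
    """Constant-time comparison, to avoid leaking the token through timing."""
    return secrets.compare_digest(expected.encode(), supplied.encode())


redacted = redact("Email ana@example.com today", [(6, 21, "private_email")])
public_url, private_url, token = make_paste_urls("abc123")
print(redacted)  # Email <PRIVATE_EMAIL> today
print(public_url)
```

Only the redacted text needs to be reachable from the public route; the private route would check the supplied token with `token_matches` before returning the original.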