PHI/PII Redaction
Scrub names, emails, phone numbers, and dates of birth from your agent execution logs in one click — with a HIPAA-friendly audit trail of every redaction run.
Overview
PHI/PII Redaction lets workspace OWNERs and ADMINs replace sensitive values in an agent's historical execution data with the literal word REDACTED. It targets the JSON payloads stored in executions (input, output, error) and execution_logs (message, structured data), walking the JSON structure and only touching string values so shape, keys, and types are preserved.
This feature is built for healthcare practices, legal teams, and any workflow that processes regulated personal data. Every redaction run writes one row to a dedicated audit table so you can answer — in writing — who redacted what, when, and how much changed.
Availability
PHI/PII Redaction is available exclusively on the Enterprise tier. The action is also gated by workspace role: only OWNER and ADMIN can redact — BUILDER and VIEWER members never see the control.
What gets redacted
You choose any combination of pattern-based and literal-value redactions for each run.
- Email addresses: a standard regex pattern that catches anything in the form name@domain.tld.
- Phone numbers: a permissive international pattern that catches numbers with optional country-code prefixes (US +1, UK +44, AU +61, and others), plus generic 8–20 digit formats with parentheses, dashes, dots, and spaces.
- Dates of birth: DD/MM/YYYY, MM/DD/YYYY, DD-MM-YYYY, DD.MM.YYYY, and ISO YYYY-MM-DD formats.
- Literal values: an exact-match list you supply, one entry per line. Use this for patient names and any other free-text strings (account numbers, internal IDs, file names) that regex can't reliably identify without false positives. Minimum 3 characters per entry, maximum 50 entries per run.
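As a rough illustration of how the pattern toggles and the literal-value list could combine, here is a minimal Python sketch. The regular expressions are simplified stand-ins, not the product's actual patterns, and the names redact_text and validate_literals are hypothetical. Phone matching is left out because its pattern is the most implementation-specific, and treating literal matches as case-sensitive is an assumption.

```python
import re

REPLACEMENT = "REDACTED"

# Simplified, illustrative patterns (assumptions, not the product's real regexes).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
DOB_RE = re.compile(
    r"\b(?:\d{4}-\d{2}-\d{2}"            # ISO YYYY-MM-DD
    r"|\d{2}([/.-])\d{2}\1\d{4})\b"      # DD/MM/YYYY, MM/DD/YYYY, DD-MM-YYYY, DD.MM.YYYY
)


def validate_literals(raw):
    """Parse a one-entry-per-line list and enforce the documented limits."""
    entries = [line.strip() for line in raw.splitlines() if line.strip()]
    if len(entries) > 50:
        raise ValueError("at most 50 literal values per run")
    too_short = [entry for entry in entries if len(entry) < 3]
    if too_short:
        raise ValueError(f"entries need at least 3 characters: {too_short}")
    return entries


def redact_text(text, *, emails=True, dobs=True, literals=()):
    """Replace every enabled match inside a single string with REDACTED."""
    if emails:
        text = EMAIL_RE.sub(REPLACEMENT, text)
    if dobs:
        text = DOB_RE.sub(REPLACEMENT, text)
    # Longest literals first, so a short entry cannot split a longer one.
    for value in sorted(literals, key=len, reverse=True):
        text = text.replace(value, REPLACEMENT)
    return text
```

Calling redact_text on a string such as "Jane Doe <jane@example.com> born 04/07/1985" with the literal "Jane Doe" would replace the name, the address, and the date while leaving the surrounding text in place.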
What gets skipped
Only executions with status COMPLETED are redacted. Failed, running, and pending executions are intentionally skipped.
This rule matters for middleware-style workflows that pass data between systems and get auto-retried after a failure. The retry runs against the original input, so PHI has to stay intact until the run actually succeeds — otherwise the next attempt would have nothing real to forward. After the retry succeeds, the next redaction run picks up the now-completed execution and scrubs it.
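The skip rule can be pictured as a simple status filter. The sketch below is an assumption-laden illustration: the Execution class and the status strings other than COMPLETED are hypothetical, chosen only to mirror the states named above.

```python
from dataclasses import dataclass


@dataclass
class Execution:
    id: str
    status: str  # e.g. "COMPLETED", "FAILED", "RUNNING", "PENDING" (assumed spellings)


def eligible_for_redaction(executions):
    """Keep only completed executions; everything else is left untouched."""
    return [e for e in executions if e.status == "COMPLETED"]


runs = [Execution("a", "COMPLETED"), Execution("b", "FAILED")]
first_sweep = [e.id for e in eligible_for_redaction(runs)]   # only "a" qualifies

# A retry succeeds later, so the next sweep picks up "b" as well.
runs[1].status = "COMPLETED"
second_sweep = [e.id for e in eligible_for_redaction(runs)]
```

Because the failed execution is never touched in the first sweep, its original input is still available to the retry.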
Three ways to run it
Open the Privacy & Compliance section on any agent detail page and pick the mode that fits your workflow.
- Manual: an on-demand button. Pick toggles, optionally enter literal values (patient names, etc.), and click Redact. Use this for ad-hoc cleanup or to scrub specific names from a known roster. This is the only mode that supports literal values.
- Scheduled: a cron expression that fires a regex-only sweep of the agent's completed executions. Standard 5-field cron (e.g. 0 2 * * * for daily at 2 AM) with an IANA timezone (e.g. America/New_York). Use this for nightly scrubs after a retention window.
- After every run: the worker scrubs each execution immediately after it completes successfully. Failed runs aren't touched, so middleware-style retries still have their original input. Use this when you want PHI to live in Falcon for the absolute minimum amount of time.
Scheduled and post-execution modes don't support literal values — there's no patient roster stored on the agent. Run a manual sweep periodically if you need to scrub known names.
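The constraints on the three modes can be summarized as a small validation routine. This is only a sketch under stated assumptions: RedactionConfig, validate, and the mode identifiers are hypothetical names, and the cron check counts fields rather than parsing them.

```python
from dataclasses import dataclass
from typing import Optional
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

MODES = {"manual", "scheduled", "after_every_run"}  # hypothetical identifiers


@dataclass
class RedactionConfig:
    mode: str
    literal_values: tuple = ()
    cron: Optional[str] = None       # e.g. "0 2 * * *"
    timezone: Optional[str] = None   # e.g. "America/New_York"


def validate(config):
    """Raise ValueError if the config breaks one of the documented rules."""
    if config.mode not in MODES:
        raise ValueError(f"unknown mode: {config.mode}")
    # Only manual runs may carry literal values.
    if config.literal_values and config.mode != "manual":
        raise ValueError("literal values are only supported in manual mode")
    if config.mode == "scheduled":
        # Field count only; a real scheduler would also parse each field.
        if config.cron is None or len(config.cron.split()) != 5:
            raise ValueError("scheduled mode needs a 5-field cron expression")
        if not config.timezone:
            raise ValueError("scheduled mode needs an IANA timezone")
        try:
            ZoneInfo(config.timezone)
        except (ZoneInfoNotFoundError, ValueError) as exc:
            raise ValueError(f"unknown timezone: {config.timezone}") from exc
```

A config such as RedactionConfig(mode="scheduled", cron="0 2 * * *", timezone="America/New_York") would pass on a system with timezone data installed, while adding literal_values to it would be rejected.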
Audit trail
Every redaction call writes one row to the redaction_logs table, capturing:
- Workspace and agent the run targeted
- The user who initiated it (id and email)
- Which redaction types ran (emails, phones, dates of birth) and the literal-value count
- How many executions were scanned vs. redacted
- Aggregated row counts (executions updated and execution_logs updated), stored as JSONB
- Timestamp
The literal values themselves are never stored in the audit table; only the count of values is recorded. This keeps the audit trail itself free of the data you just scrubbed.
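To make the count-only rule concrete, here is a minimal sketch of how such an audit row might be assembled. The field names and the build_audit_row helper are hypothetical; the actual redaction_logs schema is not shown in this document.

```python
import json
from datetime import datetime, timezone


def build_audit_row(*, workspace_id, agent_id, user_id, user_email,
                    redaction_types, literal_values,
                    scanned, redacted,
                    executions_updated, execution_logs_updated):
    """Assemble one audit record; literal values contribute only their count."""
    return {
        "workspace_id": workspace_id,
        "agent_id": agent_id,
        "user_id": user_id,
        "user_email": user_email,
        "redaction_types": sorted(redaction_types),
        "literal_value_count": len(literal_values),  # the values are never copied
        "executions_scanned": scanned,
        "executions_redacted": redacted,
        # Stored as a JSON object, mirroring the JSONB column described above.
        "row_counts": {
            "executions": executions_updated,
            "execution_logs": execution_logs_updated,
        },
        "created_at": datetime.now(timezone.utc).isoformat(),
    }


row = build_audit_row(
    workspace_id="ws_1", agent_id="agent_1",
    user_id="user_1", user_email="admin@example.com",
    redaction_types={"emails", "dob"}, literal_values=["Jane Doe", "John Roe"],
    scanned=120, redacted=37,
    executions_updated=37, execution_logs_updated=212,
)
serialized = json.dumps(row)
```

The serialized row carries the number 2 for the literal values but none of the names themselves.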
Idempotency & safety
- Idempotent: re-running a redaction with the same patterns is a no-op. Once a value has been replaced with REDACTED, further runs find nothing to match.
- Structure-preserving: redaction walks JSON in memory, touching only string values. Object keys, array indices, numbers, booleans, and nulls are never modified, so JSONB columns stay valid and your downstream tooling that reads execution payloads keeps working.
- Per-execution error isolation: if a single execution fails to update for any reason, the rest of the batch continues. Failed executions are reported back in the response so you can investigate.
- Irreversible: redacted values are gone. There is no undo. If you need to keep originals for a regulatory window before scrubbing, run the redaction at the end of that window rather than continuously.
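The first three properties can be sketched in a few lines of Python. This is an illustration under assumptions, not the product's code: redact_json and redact_batch are hypothetical names, the email pattern is a simplified stand-in, and the batch works on in-memory payloads rather than database rows.

```python
import re

REPLACEMENT = "REDACTED"
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")  # simplified


def scrub_emails(text):
    return EMAIL_RE.sub(REPLACEMENT, text)


def redact_json(node, scrub):
    """Return a copy of a JSON value with `scrub` applied to string values only."""
    if isinstance(node, str):
        return scrub(node)
    if isinstance(node, dict):
        # Keys are copied as-is; only the values are visited.
        return {key: redact_json(value, scrub) for key, value in node.items()}
    if isinstance(node, list):
        return [redact_json(item, scrub) for item in node]
    return node  # numbers, booleans, and None pass through unchanged


def redact_batch(payloads, scrub):
    """Redact each execution payload independently and collect failures."""
    updated, failed = {}, []
    for execution_id, payload in payloads.items():
        try:
            updated[execution_id] = redact_json(payload, scrub)
        except Exception:
            # One bad execution must not stop the rest of the batch.
            failed.append(execution_id)
    return updated, failed
```

Running redact_json a second time on its own output returns the same value, because REDACTED does not match the pattern that produced it.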