Free PII Scanner

Privacy Scanner — Find & Redact PII in Any Text

Paste any text, click Scan, and every name, email, phone number, address, and account number is highlighted on the spot. Copy a redacted version — perfect for cleaning a prompt before pasting into ChatGPT, Claude, or Gemini. Runs 100% in your browser.


About Privacy Scanner — Find & Redact PII in Any Text

If you've ever caught yourself about to paste a customer email, a contract draft, or a Slack export into ChatGPT and stopped to wonder "wait, should this leave my laptop?" — this tool exists for that exact moment. The Privacy Scanner reads through any block of text, highlights every personal data point it finds (names, email addresses, phone numbers, postal addresses, dates, account numbers, IDs, URLs, organizations, and secret-like strings), and gives you a one-click cleaned version you can paste into the LLM of your choice. The detection model and your text both stay on your device — there's no upload, no signup, and no log of what you scanned.

Under the hood, the scanner runs OpenAI's open-weight privacy-filter model — a multilingual XLM-RoBERTa fine-tune trained specifically to recognize PII — through transformers.js, accelerated by WebGPU when available and falling back to WebAssembly otherwise. The first scan downloads the q4-quantized model from Hugging Face's CDN (one-time, cached afterwards), and every subsequent scan starts instantly. Because the model is multilingual, it works on Spanish, Portuguese, German, French, Japanese, and many other languages — not just English text.
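
As a rough illustration of the backend choice described above, the sketch below picks WebGPU when the browser exposes it and WebAssembly otherwise. The `device` and `dtype` option names are the ones transformers.js v3 accepts in its pipeline options, to the best of our knowledge. The helper `scannerOptions` and the `NavigatorLike` type are hypothetical names made up for this example, and this is not the tool's actual source.

```typescript
// Minimal shape of the browser `navigator` object that this sketch inspects.
interface NavigatorLike {
  gpu?: unknown; // present in browsers that expose the WebGPU API
}

interface ScannerOptions {
  device: "webgpu" | "wasm"; // execution backend
  dtype: "q4";               // quantization level of the downloaded weights
}

// Hypothetical helper: prefer WebGPU, fall back to WebAssembly.
function scannerOptions(nav: NavigatorLike | undefined): ScannerOptions {
  const hasWebGPU =
    nav !== undefined && nav.gpu !== undefined && nav.gpu !== null;
  return { device: hasWebGPU ? "webgpu" : "wasm", dtype: "q4" };
}
```

In a browser you would call `scannerOptions(navigator)` and pass the result as the options argument when creating the pipeline. A stricter check could also await `navigator.gpu.requestAdapter()`, which can resolve to null even when `navigator.gpu` exists.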

Three redaction modes cover the most common downstream uses. "Mask" replaces each entity with [REDACTED] — the safest default when you want a human reviewer to see that something was removed but not what. "Label" replaces values with their category — [PERSON], [EMAIL], [PHONE] — which keeps the structure intact so an LLM still understands the prompt's shape. "Remove" strips the entity entirely, useful for short snippets where you want the cleanest output. Pick whichever fits your workflow, then copy or download a .txt of the cleaned text.
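
The three modes can be sketched as one small function. This is a minimal illustration, not the tool's implementation: the `Entity` shape (character offsets plus a category label) and the function name `redact` are assumptions, and the sketch assumes detected spans do not overlap.

```typescript
type Mode = "mask" | "label" | "remove";

// Hypothetical shape of one detected entity.
interface Entity {
  start: number; // inclusive character offset into the text
  end: number;   // exclusive character offset into the text
  label: string; // category, e.g. "PERSON" or "EMAIL"
}

function redact(text: string, entities: Entity[], mode: Mode): string {
  // Replace from the last span to the first so earlier offsets stay valid.
  const sorted = [...entities].sort((a, b) => b.start - a.start);
  let out = text;
  for (const e of sorted) {
    const replacement =
      mode === "mask" ? "[REDACTED]" : mode === "label" ? `[${e.label}]` : "";
    out = out.slice(0, e.start) + replacement + out.slice(e.end);
  }
  return out;
}
```

Note that in this sketch "remove" leaves the surrounding whitespace untouched, so a removed word in the middle of a sentence leaves two spaces behind.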

The Privacy Scanner is the free answer to the urge to just paste something into ChatGPT. It's not a replacement for careful manual review of highly sensitive content: no PII model is perfect, and corner cases (initials-only names, partial addresses, unusual ID formats) can slip through. Use it as a first-pass scrubber that catches the obvious matches in a second, then reread the output once before you send it. For PDFs with sensitive content, our PDF Redact tool runs the same engine but applies findings as black redaction rectangles and rasterizes affected pages on save, so the underlying text is destroyed, not just visually covered.

Privacy Scanner — Find & Redact PII in Any Text — Frequently Asked Questions

How is this different from pasting text into ChatGPT?

When you paste sensitive text into ChatGPT, Claude, Gemini, or any other cloud LLM, that text leaves your device and ends up on a third-party server — and depending on your settings and tier, it may be retained for training. The Privacy Scanner runs the detection model entirely in your browser. The text never leaves your device, there is no signup or account, and no server has any record of what you scanned. The whole point of the tool is to scrub the prompt before it leaves you.

What kinds of PII does it detect?

Names of people (PERSON), email addresses (EMAIL), phone numbers (PHONE), postal/street addresses (ADDRESS), dates (DATE), account numbers including IBAN/credit card style (ACCOUNT), identification numbers like passport/SSN (ID), URLs, organization names (ORG), and secret-like patterns such as passwords or API keys (SECRET). The underlying model is OpenAI's privacy-filter, trained specifically for this task — it errs on the side of recall, so review the highlights and copy whichever cleaned version fits your downstream use.

Does it work in languages other than English?

Yes. The privacy-filter model is multilingual (built on XLM-RoBERTa) and identifies PII across many languages. Quality is best on Latin-alphabet languages such as Spanish, Portuguese, German, French, Italian, and Dutch, and weaker on highly inflected languages and non-Latin scripts. Japanese, Chinese, and Arabic work but with lower recall. If you scan a non-English document and find that the model missed something, redact the missed values by hand and review the whole output manually; the redaction modes only apply to entities the model detected.

Is the scanning really private?

Yes. The model is downloaded once from Hugging Face's public CDN and cached by your browser. From that point on, every scan runs entirely on your device — no text or scan results are ever sent to FormatFuse, OpenAI, Google, or any other server. You can verify this by opening your browser's Network tab while you scan: after the first-time model download, there are zero outbound requests. We have no server that could log your text even if we wanted to.

What should I do with the redacted output?

Pick the redaction mode that fits your downstream use. "Mask" replaces each entity with [REDACTED] and is the safest default when a human reviewer needs to see that something was removed. "Label" replaces it with [PERSON], [EMAIL], and so on, which is best when an LLM still needs to know the structural role of what used to be there. "Remove" strips the entity entirely. Always do a final manual reread before sending: no model is perfect, and the context around a redaction (like "the customer mentioned in the [REDACTED] ticket") can still leak information indirectly.

Why is the first scan slow?

On first use the tool downloads the privacy-filter model — about 290 MB at q4 quantization, served from Hugging Face's CDN. Your browser caches it after the download, so every subsequent scan starts instantly (typically under a second for a few thousand characters of text). If your network is slow, the progress bar in the Scan button shows the download percentage. The download happens entirely between you and Hugging Face's CDN — FormatFuse never sees the request.

Are there any limits on how much I can scan?

There's a 50,000-character limit per scan, mostly so very long inputs don't hang the browser. For most use cases (emails, support tickets, contract clauses, chat exports, code comments, CSV rows) that's plenty. For longer documents, split them into chunks and scan one at a time. There's no per-day quota, no signup, no account, and no cap on the number of scans: the tool runs on your device, so we have no costs to pass on.
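
If you want to split a long document programmatically before pasting, something like the following would do it. This is a minimal sketch under stated assumptions: `splitForScan` and `Chunk` are hypothetical names, the limit is counted in JavaScript string length (UTF-16 code units), which may not match how the tool counts characters, and cuts are placed after whitespace where possible so a name or address is less likely to be split across two scans.

```typescript
const MAX_CHARS = 50_000; // per-scan limit stated above

interface Chunk {
  offset: number; // where this chunk starts in the original text
  text: string;
}

function splitForScan(text: string, maxChars: number = MAX_CHARS): Chunk[] {
  if (maxChars <= 0) throw new RangeError("maxChars must be positive");
  const chunks: Chunk[] = [];
  let start = 0;
  while (start < text.length) {
    let end = Math.min(start + maxChars, text.length);
    if (end < text.length) {
      // Prefer to cut just after the last newline or space in this slice.
      const slice = text.slice(start, end);
      const cut = Math.max(slice.lastIndexOf("\n"), slice.lastIndexOf(" "));
      if (cut > 0) end = start + cut + 1;
      // Otherwise fall back to a hard cut at maxChars.
    }
    chunks.push({ offset: start, text: text.slice(start, end) });
    start = end;
  }
  return chunks;
}
```

Keeping each chunk's `offset` makes it possible to map findings from a chunk back to positions in the full document. An entity that straddles a hard cut can still be missed, so reread the text around chunk boundaries.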

What about PDFs and other documents?

For PDFs, use our PDF Redact tool — it runs the same on-device privacy-filter engine but applies the findings as black redaction rectangles and rasterizes affected pages on save, so the underlying text is destroyed (not just visually covered). For images of text, use our Image to Text (OCR) tool to extract the text, then paste it here. For Word documents and plain .txt files, copy the contents into the textarea above.