Capabilities

Redacting Images with OCR

The Tool

~/apps/cc/recipes/redact-image.py takes a screenshot and one or more strings, finds each string with OCR, and blurs or black-bars the pixels behind it. By default the output is written next to the original with .redacted inserted before the extension (input.png becomes input.redacted.png); pass -o to choose a different path.

python3 ~/apps/cc/recipes/redact-image.py input.png "ojhurst@gmail.com"

Multi-word strings work, and so do multiple strings in one call. The script reports any string it could not match, so an unredacted image does not ship by accident.

Arguments

input — image path (positional)
-o / --output — output path (default: input.redacted.ext)
--mode blur|black — default blur; black draws a flat rectangle
--radius N — Gaussian blur radius (default 10)
--pad N — pixels of padding around each match (default 4)
trailing positional args — one or more strings to redact
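
The argument list above could be modeled with argparse roughly as follows. This is a minimal sketch, not the script's actual source: build_parser, default_output, and the targets name are hypothetical, and only the flags and defaults come from the list above.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(description="Redact strings in an image via OCR.")
    parser.add_argument("input", help="image path")
    parser.add_argument("targets", nargs="+", help="one or more strings to redact")
    parser.add_argument("-o", "--output", default=None,
                        help="output path (default: input.redacted.ext)")
    parser.add_argument("--mode", choices=["blur", "black"], default="blur")
    parser.add_argument("--radius", type=int, default=10, help="Gaussian blur radius")
    parser.add_argument("--pad", type=int, default=4, help="padding around each match")
    return parser


def default_output(input_path: str) -> Path:
    """Insert .redacted before the extension: shot.png -> shot.redacted.png."""
    p = Path(input_path)
    return p.with_name(f"{p.stem}.redacted{p.suffix}")


args = build_parser().parse_args(
    ["shot.png", "--mode", "black", "472 S Loafer View Dr", "All Things Handy"]
)
print(args.mode, args.pad, args.targets)
# prints: black 4 ['472 S Loafer View Dr', 'All Things Handy']
print(args.output or default_output(args.input))
# prints: shot.redacted.png
```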

Example with black bars:

python3 ~/apps/cc/recipes/redact-image.py shot.png --mode black "472 S Loafer View Dr" "All Things Handy"
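
To show what the two modes do to a matched region, here is a hedged Pillow sketch. pad_box and redact_box are hypothetical helper names, the image is assumed to be in RGB mode, and the real script may structure this differently.

```python
from PIL import Image, ImageFilter


def pad_box(box, pad, size):
    """Grow a (left, top, right, bottom) box by pad pixels, clamped to the image."""
    left, top, right, bottom = box
    width, height = size
    return (max(0, left - pad), max(0, top - pad),
            min(width, right + pad), min(height, bottom + pad))


def redact_box(img, box, mode="blur", radius=10):
    """Redact one box in place; assumes an RGB image."""
    if mode == "black":
        # Flat fill: paste a solid color over the box.
        img.paste((0, 0, 0), box)
    else:
        # Crop the region, blur it, and paste it back in the same spot.
        region = img.crop(box).filter(ImageFilter.GaussianBlur(radius))
        img.paste(region, box)
    return img


img = Image.new("RGB", (200, 60), "white")  # stand-in for a real screenshot
box = pad_box((50, 20, 150, 40), pad=4, size=img.size)
redact_box(img, box, mode="black")
```

A flat fill destroys the underlying pixels entirely, while a blur only obscures them, so black is the more conservative choice for sensitive text.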

How It Works

Pillow opens the image.
pytesseract.image_to_data runs Tesseract and returns bounding boxes for every word it detected.
For each target string, the script walks the OCR token stream and matches multi-word phrases by normalizing out punctuation and casing.
For each match, it crops the region, applies a Gaussian blur (or a black fill), and pastes it back.
The script saves to the output path and exits non-zero if any target string was not found by OCR.
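
The matching step can be sketched as below. This is an illustration under assumptions, not the script's code: normalize and find_phrase are hypothetical names, and data is assumed to be the dict returned by pytesseract.image_to_data(img, output_type=pytesseract.Output.DICT), which holds parallel text, left, top, width, and height lists. A hand-built dict stands in for real OCR output here.

```python
import re


def normalize(s: str) -> str:
    """Lowercase and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", s.lower())


def find_phrase(data: dict, phrase: str) -> list:
    """Return one (left, top, right, bottom) box per occurrence of phrase."""
    wanted = [normalize(w) for w in phrase.split() if normalize(w)]
    if not wanted:
        return []
    # Keep non-empty tokens, remembering each one's index into the OCR lists.
    tokens = [(i, normalize(t)) for i, t in enumerate(data["text"]) if normalize(t)]
    boxes = []
    for start in range(len(tokens) - len(wanted) + 1):
        window = tokens[start:start + len(wanted)]
        # Containment (in) rather than equality, to tolerate OCR noise.
        if all(w in tok for w, (_, tok) in zip(wanted, window)):
            idx = [i for i, _ in window]
            boxes.append((
                min(data["left"][i] for i in idx),
                min(data["top"][i] for i in idx),
                max(data["left"][i] + data["width"][i] for i in idx),
                max(data["top"][i] + data["height"][i] for i in idx),
            ))
    return boxes


# Hand-built stand-in for OCR output.
data = {
    "text": ["", "Email:", "ojhurst@gmail.com", "All", "Things", "Handy,", "LLC"],
    "left": [0, 10, 70, 10, 40, 90, 150],
    "top": [0, 10, 10, 40, 40, 40, 40],
    "width": [0, 50, 160, 25, 45, 50, 30],
    "height": [0, 12, 12, 12, 12, 12, 12],
}
print(find_phrase(data, "All Things Handy"))  # prints: [(10, 40, 140, 52)]
```

Each returned box is the union of the matched words' boxes, which is the region that would then be padded and blurred or filled.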

No network calls. No image leaves the machine. No paid APIs. This matches the super-CLAUDE rule against paid API usage for automation.

Dependencies

tesseract — brew install tesseract
pytesseract — pip3 install pytesseract --break-system-packages
Pillow — typically already present

When To Use It

Publishing a screenshot that shows an email, phone, address, or business name
Scrubbing a CRM contact ID out of a screen recording frame
Any image headed to a public blog post, social, or external support ticket

Pair with the Screenshot Triage flow: grab the screenshot, identify the strings to blur, redact, commit the clean copy to the target repo.
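
Because the script exits non-zero when a string is missed, a triage flow can gate the commit step on the exit code. The wrapper below is a hypothetical sketch: redact_command and run_ok are illustrative names, and only the script path and the exit-code behavior come from this page.

```python
import subprocess
import sys
from pathlib import Path

SCRIPT = Path.home() / "apps/cc/recipes/redact-image.py"


def redact_command(image: str, targets: list) -> list:
    """Build the command line documented above."""
    return ["python3", str(SCRIPT), image, *targets]


def run_ok(cmd: list) -> bool:
    """Run cmd; return True only on exit code 0, echoing stderr otherwise."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # A non-zero exit means at least one string was not found by OCR.
        print(result.stderr, file=sys.stderr)
        return False
    return True
```

A caller would commit the clean copy only when run_ok(redact_command("shot.png", ["ojhurst@gmail.com"])) returns True.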

Failure Modes

OCR miss on stylized fonts or low-contrast text: Tesseract sometimes does not find the text at all. The script prints the missed strings to stderr. Fix: upscale the image before feeding it in, or fall back to --mode black with manual coordinates.
Partial word match on substring-only OCR: the matcher uses Python's in (substring containment) rather than equality to tolerate OCR noise, so a short target can occasionally match inside a longer word. If you see extra redactions, pass the full phrase instead of a single word.
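
The over-match case is easy to reproduce in isolation. The snippet below is a simplified stand-in for the matcher (hits is a hypothetical function and the token list is invented for illustration); it shows why a single short word matches more than intended and how a full phrase narrows it.

```python
def hits(tokens: list, phrase: str) -> list:
    """Start indexes where every phrase word is contained in the aligned token."""
    words = phrase.lower().split()
    if not words:
        return []
    return [
        i for i in range(len(tokens) - len(words) + 1)
        if all(w in tokens[i + k].lower() for k, w in enumerate(words))
    ]


tokens = ["All", "Things", "Handy", "called", "a", "handyman", "today"]

print(hits(tokens, "handy"))             # prints: [2, 5]  ("handyman" also matches)
print(hits(tokens, "all"))               # prints: [0, 3]  ("called" contains "all")
print(hits(tokens, "All Things Handy"))  # prints: [0]
```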

Related

Chrome Extension SOP: the “never send James hunting for logs” sibling principle. If Claude can do it in a tool, the tool owns it.
Screenshot storage: ~/apps/cc/screenshots/ (auto-capture by the screenshot daemon)