I hadn’t noticed it before Big Sur, but Apple’s bundled Preview app can now redact PDF documents. This article looks at how thorough that is, and whether you can rely on it removing all access to sensitive content in your PDFs.
The PDF format was never designed for tasks such as redaction of content. A typical PDF document consists of dozens or even thousands of objects containing snippets of text, each normally compressed and assembled in a complex tree. Not only that, but they often contain objects from previous edits, some of which may still contain text which is invisible to the user. Ensuring that all those objects have been properly redacted, and none leaks anything that could cause trouble, is an almost impossible task.
In short, no one should ever choose to develop a redaction tool for PDFs, as it’s a course set for almost inevitable failure. And any user whose sensitive content has been left accessible in a redacted PDF will never forgive you.
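To give a feel for why this is so hard, here is a minimal Python sketch of the structure described above. It counts end-of-file markers (more than one usually means the file carries earlier revisions from incremental saves) and inflates any Flate-compressed streams so their contents can be inspected. The function names are my own, and the scanner is deliberately naive: it assumes an unencrypted file, ignores the declared stream lengths and any filter other than Flate, and is no substitute for a proper PDF parser.

```python
import re
import zlib

# Naive pattern for "stream ... endstream" bodies. A real parser would
# honour each stream's /Length entry instead of searching for keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\nendstream", re.DOTALL)


def count_revisions(pdf_bytes: bytes) -> int:
    """Count %%EOF markers; more than one suggests incremental saves."""
    return pdf_bytes.count(b"%%EOF")


def inflate_streams(pdf_bytes: bytes) -> list[bytes]:
    """Return the decompressed body of every stream that inflates cleanly."""
    bodies = []
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            # decompressobj tolerates stray bytes after the compressed data.
            bodies.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            continue  # not Flate-compressed, or uses another filter
    return bodies
```

Even when a search of the inflated streams finds nothing, that proves little: text is often stored in font-specific encodings or split into fragments, so a clean result here is not evidence of a clean document.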
How does it work?
Preview’s Redact tool is simple to use. Open the document you want to redact, select the Redact command in the Tools menu, and outline the text and graphics which you wish to redact. If you just want to select text, you can use the text selection cursor; otherwise draw a box around the area and it will turn black. Once you’ve finished, visit the Tools menu and uncheck the command before you wipe anything else out: this isn’t a single-shot command, but a tool state.
When you first select the command, unless you’ve already disabled it, you’ll be warned that what you’re about to redact will be removed permanently.
Once text has been selected for redaction, it’s shown in black overprint with Xes, which is reassuring.
More troubling, perhaps, is that when you place the pointer over the area of redaction, you can still see the text which has been “removed permanently”.
Save the file, though, and when it’s reopened it does appear to have been fully redacted, apart from a few fragments of letters at the right edge.
Does it work?
However, my redaction test file contains a secret: the visible text actually extends further to the right, off the edge of the page. That invisible text is easily recovered from the PDF using Adobe Acrobat CC. So has Preview removed that, or was it too fooled into leaving what it couldn’t see on the page?
When I examined the redacted file with Acrobat’s Sanitize tool, Acrobat was able to recover the visible fragments of letters, and the whole of the hidden text.
Preview’s redaction tool actually does appear to do quite a good job, as far as it can see text in a PDF document. The visible letter fragments are a bit worrying, but it is its inability to redact the invisible text that casts doubt on its use for anything even slightly serious.
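To make the off-page case concrete, here is a small Python sketch that classifies a run of text against the page rectangle. It is purely illustrative: I don’t know how Preview decides what to redact, and the Box type, the visibility function and the coordinates are all hypothetical, using a US Letter page of 612 × 792 points.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle in PDF points, origin at bottom left."""
    left: float
    bottom: float
    right: float
    top: float


def visibility(text: Box, page: Box) -> str:
    """Classify a text box as 'visible', 'clipped' or 'off-page'."""
    # No overlap with the page at all.
    if (text.right <= page.left or text.left >= page.right
            or text.top <= page.bottom or text.bottom >= page.top):
        return "off-page"
    # Entirely inside the page rectangle.
    if (text.left >= page.left and text.right <= page.right
            and text.bottom >= page.bottom and text.top <= page.top):
        return "visible"
    # Partly on the page: the remainder is stored but never displayed.
    return "clipped"


page = Box(0, 0, 612, 792)                 # hypothetical US Letter page
line = Box(72, 700, 900, 712)              # runs past the right edge
print(visibility(line, page))              # clipped
```

A line classed as clipped is the awkward one: everything beyond the page edge is still stored in the file, yet a tool that works only on what is displayed never gets to see it.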
I have written this before and will repeat it here: no redaction tool can be worth using unless the same app can scavenge PDFs for hidden content in the way that Acrobat’s Sanitize tool does. Redact is only part of the solution – sanitize is the more difficult and more essential companion. For the moment, Adobe Acrobat CC remains the only tool capable of reliable redaction of PDFs, and even there I have my doubts.