Yesterday, I pointed out some of the pitfalls of redacting documents and preparing them for release to others, or for publication. I wrote that PDF is one of the few formats which is well supported, and it is of course a popular medium for many. This article works through preparing a document for publication using Adobe Acrobat Pro DC. Adobe also provides details in its online user guide.
Adobe only provides its redaction tools in paid-for versions of Acrobat, and those described here are part of Adobe Acrobat Pro DC. There are similar tools available in some other advanced PDF editors too.
Like all the more useful tools in Acrobat, you have to open the Tools view in order to find and engage them. Select the Redact tool, and add it to your document view.
This adds a toolbar with three main actions: mark and redact, remove hidden information, and sanitize the document. Make a point of using each of them, in that order.
Redact obliterates selected sections of text and/or graphics, normally using filled black rectangles, just like a black marker. Removing hidden information tackles those elements which you can’t see to redact, but which Acrobat detects. Sanitising handles all the other things which could trip you up, like metadata and odd bits of data hanging around in the document file.
If you’re content with the standard black marker approach to redaction, that is the default. If you need to add tags to explain why the redaction has taken place, select the Properties tool and you can set that up in this dialog. You can add your own codes to cater for non-US purposes.
The first time that you use the Mark for Redaction tool, Acrobat explains in this dialog how the process works. First, you work through the document marking up the areas for redaction. Because applying redaction removes all trace of the selected text/graphics, you should then check those intentions before making them permanent. This seems clumsy, but is actually the best way to do this robustly.
By default, intended redactions are shown in red boxes until you make them permanent.
Then the underlying content is removed, leaving just a filled black box to show where the content was.
Whenever you apply redactions to make them permanent, Acrobat kindly offers to proceed to the next step, of locating and removing hidden information. This is another good piece of design.
Once hidden information has been located, it is shown by type in a sidebar to the left. My version of Acrobat Pro DC (up to date) consistently quit unexpectedly whenever I tried to show the preview of the document metadata, though.
Viewing hidden text is also excellent: the PDF I was using had two sections of hidden text, one concealed behind an image, and the other in text coloured white inserted between paragraphs. Although not impossible to detect manually, they would normally be overlooked, and released in the published document. But Acrobat found both: this is the text hidden behind the image.
This is the text displayed in white.
The final step is to strip everything else from the document which could contain information which shouldn’t be released. When you click on the Sanitize Document tool, this is the list provided. Note that this even includes macOS extended attributes such as the ‘quarantine’ flag, although they are not explicitly listed here.
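Extended attributes live in the file system rather than inside the PDF itself, so they are one of the few items that can be checked independently of Acrobat. The following is a minimal sketch, assuming Python 3 and either `os.listxattr` (available on Linux) or the `xattr` command-line tool that ships with macOS. The function names are my own, chosen for illustration.

```python
import os
import subprocess

QUARANTINE_ATTRIBUTE = "com.apple.quarantine"  # name macOS uses for the quarantine flag


def list_extended_attributes(path):
    """Return the names of the extended attributes attached to a file."""
    if hasattr(os, "listxattr"):
        # Linux builds of Python expose extended attributes directly.
        try:
            return sorted(os.listxattr(path))
        except OSError:
            return []  # file system without extended attribute support
    # macOS: Python lacks os.listxattr there, so ask the xattr tool instead.
    result = subprocess.run(
        ["xattr", path], capture_output=True, text=True, check=False
    )
    return sorted(line for line in result.stdout.splitlines() if line)


def has_quarantine_flag(path):
    """Report whether the file still carries the quarantine attribute."""
    return QUARANTINE_ATTRIBUTE in list_extended_attributes(path)
```

Calling `list_extended_attributes()` on the document before and after sanitising shows which attributes, if any, survive. Bear in mind that attributes can be added again later, for instance when a file is downloaded, so this only describes the file as it sits on disk at that moment.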
My biggest concern with Acrobat’s redaction features is that they are almost impossible to audit. There was a time, long ago, when the text content of PDF files was stored as plain text within the PDF file, and you could open a PDF and check it using a text editor such as BBEdit. When Acrobat saved the first fully redacted and sanitised version of my test document, its size grew by about 70%, from 581 KB to 994 KB, which seems very worrying. I therefore recommend that you complete redacting and sanitising, then duplicate the document produced by that. Open the duplicate in Acrobat, sanitise it a second time, and save. In my test case, the final document size fell to 507 KB, which seems much more reasonable.
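A partial audit is still possible outside Acrobat. The sketch below, which assumes Python 3 and uses only its standard library, searches the raw bytes of a PDF and the contents of any streams that `zlib` can inflate for phrases which should have been removed. The function names are hypothetical, and the approach is deliberately crude: it looks at Flate-compressed streams only, and it will miss text stored with a custom font encoding, another compression filter, or encryption. Finding a phrase proves that it survived; finding nothing proves very little.

```python
import re
import zlib

# Matches the data between "stream" and "endstream" in a PDF stream object.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)


def inflate_streams(pdf_bytes):
    """Yield the decompressed contents of every stream zlib can inflate."""
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            # decompressobj tolerates trailing or slightly truncated data.
            yield zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate-compressed (an image filter, say), or damaged


def find_leftovers(pdf_bytes, phrases):
    """Return the phrases still present in the raw file or its streams."""
    haystacks = [pdf_bytes, *inflate_streams(pdf_bytes)]
    found = set()
    for phrase in phrases:
        # Page text is usually single-byte; metadata strings may be UTF-16BE.
        encodings = (phrase.encode("utf-8"), phrase.encode("utf-16-be"))
        if any(needle in hay for hay in haystacks for needle in encodings):
            found.add(phrase)
    return sorted(found)


def audit_file(path, phrases):
    """Read a PDF from disk and report which phrases can still be found."""
    with open(path, "rb") as handle:
        return find_leftovers(handle.read(), phrases)
```

To use it, call `audit_file()` with the path to the sanitised document and a list of names or phrases that you redacted. An empty list is the hoped-for result, subject to the limitations above.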
If you need to supply or publish redacted documents, and PDF is a suitable format for them, Acrobat Pro DC looks like a thoroughly reliable platform with which to prepare them.