PDF without Adobe: 1 At the heart of macOS

I’ve been using Portable Document Format (PDF) for over twenty-five years now, and despite my initial misgivings, it has established itself as one of the most important document formats. PDF came out of PostScript, the original page description language for laser printers which was developed by the founders of Adobe, including John Warnock and Charles Geschke, during the early 1980s.

When PDF was introduced in 1993, Macs already had two vital graphics languages: QuickDraw and PostScript. QuickDraw was for the display, and had some amazing advanced features. You may still bump into it now, in its PICT format graphics files. PostScript was for high-end printers, particularly Apple’s LaserWriter which enabled the Mac to lead the ‘desktop publishing revolution’.

Adobe Acrobat PDF was but one contender among several for a universal document format, which would enable anyone to distribute electronic copies of documents with arbitrary content. To create a PDF document in those days, you had to print to a PostScript file, then use Adobe’s Distiller app to convert that into PDF.

With the Mac sticking to its proprietary QuickDraw, the designers of the NeXT computer opted for Display PostScript as the centrepiece of its graphics. At the time, many thought this to be a mistake: PostScript had been designed to render pages at the slower pace of print engines, so it wasn’t as efficient a graphics language as QuickDraw.

When Apple bought NeXT and the two operating systems were merged to form the beginnings of Mac OS X, Display PostScript was replaced with PDF as the central graphics standard for both display and printing, in what was dubbed Quartz 2D – which lives on today in macOS.

PDF is based on PostScript, but has proved superior in both performance and cost. Adobe built much of its business on the licence fees for PostScript, but PDF was standardised as an open format (ISO 32000) and is thus free to use.

macOS has several key document formats, including Rich Text (RTF, used throughout styled text content via TextKit), HTML (in WebKit), plain text of course, XML (particularly in Property Lists), and PDF. The last of these is supported by PDFKit, which is part of the Quartz graphics system, used for display and printing.

Like PostScript before it, and unlike Rich Text and HTML, PDF is not designed to be interactively edited – which is one of its strengths. I have innumerable PDF documents on this iMac going back to the very first release of PDF in 1993, all of which have been unchanged since they were first created. Many are reference manuals, legal documents, and other material which you wouldn’t want to be tampered with.

There are occasions, though, when you need to add to or change a PDF document, and the market for PDF forms has flourished in recent years. In the UK, they form a major part of the national tax filing system, for instance.

I have been using Adobe Acrobat Pro as my ‘serious’ tool for working with PDFs since its first release. Originally Adobe Distiller, it was the only way to turn the Mac’s PostScript output into PDF documents, then became part of the benchmark suite of tools supporting PDF creation and editing. Sadly, this year I have finally abandoned Adobe’s products as they have priced themselves out of their own market, wanting around £500 for a ‘perpetual’ licence, or more than £25 per month on a rental basis.

I still complete my tax forms using Adobe’s free Acrobat Reader DC, but wanted a high-end PDF editor for occasional use, and a low-end reader for just quickly accessing PDFs every day. I think that I have found a suitable replacement for editing in PDFpenPro 10, and will report on it later.

As a reader, neither Preview nor Adobe Acrobat Reader DC works how I want it to. The latter, in particular, has become cluttered and clunky to use. I just want to browse PDFs and extract content, such as source code as text, and the occasional image. I’d also like to be able to copy out lengthier passages sometimes – simply and without any fuss.

So I’m experimenting with PDFKit in another free tool, Podofyllin. To save you reaching for Wikipedia, podophyllin is a pretty toxic resin which is extracted from the roots of the mandrake plant (a favourite in alchemy and witchcraft) and is sometimes used to remove warts. It also happens to contain the letters P, D and F in order.

Here’s a first beta release of Podofyllin. Each document window has four views, from left to right: a thumbnail view, the main PDF view, the extracted plain text, and a fourth view which is currently unused but will contain the tree-structure view in a future release.

It is based on macOS PDFKit, so can’t change the documents it opens in any way. You can select and copy content from the PDF view, which is sort of Rich Text, and of course from the plain text view. You can also reorder the pages using their thumbnails, but any changes won’t be saved to the original PDF document.
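
To give a flavour of what this involves, here is a minimal sketch of opening a PDF with PDFKit, pulling out the plain text of each page, and swapping two pages in memory. It is not Podofyllin’s source code: the file path is a placeholder, and the structure is only an assumption about how such a tool might go about it. Nothing here writes to disk, so the original file is left untouched.

```swift
import Foundation
import PDFKit

// Placeholder path, for illustration only.
let url = URL(fileURLWithPath: "/path/to/document.pdf")

// PDFDocument(url:) is a failable initialiser: it returns nil if the
// file can't be opened as a PDF.
guard let document = PDFDocument(url: url) else {
    print("Could not open the PDF")
    exit(1)
}

// Extract the plain text of each page in turn.
for index in 0..<document.pageCount {
    // page(at:) and PDFPage.string both return optionals.
    let text = document.page(at: index)?.string ?? ""
    print("Page \(index + 1): \(text.count) characters")
}

// Reorder pages in memory by swapping the first two, if there are
// at least two. As nothing is saved, the file on disk is unchanged.
if document.pageCount >= 2 {
    document.exchangePage(at: 0, withPageAt: 1)
}
```

In an app, the same PDFDocument would normally be handed to a PDFView for display rather than printed to the console.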

Podofyllin version 1.0b1 is available from here: podofyllin10b1, and from Downloads above.

When writing this, I was surprised to discover how few accounts there are of using PDFKit in Swift. As I have written fewer than twenty lines of code, I will work through creating a similar app in later articles in this series.