PDF without Adobe: 23 The nightmare of forms

One of the most useful extensions to the original concepts behind Adobe’s Portable Document Format, PDF, is the PDF form. Forms were first introduced as an extension to annotations, and many government and corporate services now offer them as an alternative to paper. In some cases they have become the only practical way to file tax returns, apply for residence permits, and do other essentials. The bad news, though, is that PDF forms are a minefield, and even the most sophisticated of third-party products is guaranteed to be incompatible with some form formats.

PDF forms come in two main types: AcroForms, which are governed by open standards, and Adobe XFA Forms, which are proprietary. Worse still, within those two broad types there are multiple variants.

AcroForms

AcroForms were the original format, and first appeared in PDF version 1.2. They’re essentially an extension of annotation features which allow form data to be submitted, reset and imported. Form data can be submitted in any of four formats:

  • HTML Form format is based on HTML which can now extend to HTML 4.01;
  • Forms Data Format, FDF, has an internal structure similar to that of PDF, and is based on a dictionary of objects;
  • XML Forms Data Format, XFDF, is a limited implementation of FDF using XML format, and is itself available in several different versions;
  • PDF, in which the entire document is submitted, as the added content becomes part of that document.

Values which are entered into a form can be kept in the PDF document itself, or stored in an external file using FDF or XFDF format.
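To give a feel for what such an external file contains, here’s a minimal sketch in Python that builds an XFDF document from a set of field values. The element names and namespace are those of XFDF as I understand it, but the field names used in the example are purely hypothetical, and a real form would need names matching its own fields exactly.

```python
import xml.etree.ElementTree as ET

# Namespace used by XFDF documents.
XFDF_NS = "http://ns.adobe.com/xfdf/"


def build_xfdf(values):
    """Return an XFDF document, as a string, holding the given field values.

    `values` maps field names to their text values. The names are whatever
    the form itself defines; those in the usage example are made up.
    """
    # Emit the XFDF namespace as the default one, without a prefix.
    ET.register_namespace("", XFDF_NS)
    root = ET.Element(f"{{{XFDF_NS}}}xfdf")
    fields = ET.SubElement(root, f"{{{XFDF_NS}}}fields")
    for name, value in values.items():
        # Each field carries its name as an attribute and its value as a child.
        field = ET.SubElement(fields, f"{{{XFDF_NS}}}field", {"name": name})
        ET.SubElement(field, f"{{{XFDF_NS}}}value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical field names, for illustration only.
    print(build_xfdf({"FamilyName": "Doe", "AptNumber": "4B"}))
```

Whether a given PDF app can import a file like this into a form is another matter, and depends on how fully it supports AcroForms.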

AcroForms support a range of controls and entry devices such as text boxes and buttons, and allow the use of JavaScript which can, for example, enable sections of the form depending on the user’s responses. Support for JavaScript varies between different PDF apps. The built-in support in macOS, in its PDFKit and Quartz 2D engine, doesn’t include deep support for AcroForms, which are left to third-party apps to implement, although controls are normally rendered correctly and most function fairly well.

Adobe XFA Forms

Adobe introduced a proprietary form system with its XML Forms Architecture (XFA) back in PDF version 1.5. Although XFA is now officially deprecated in the international PDF 2.0 open standard, XFA Forms are still commonly encountered. In practice they’re only supported by Adobe products and by third parties which have licensed XFA from Adobe.

To the best of my knowledge, built-in support in macOS, in PDFKit and Quartz 2D, doesn’t include any support for XFA Forms.
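Telling the two types apart isn’t always obvious from the outside. An interactive form is declared by an AcroForm dictionary in the document, and one that uses XFA adds an XFA entry to that dictionary. The following Python sketch looks for those two names in the raw bytes of a file. It’s only a rough heuristic: it assumes the names appear uncompressed, so it will miss any held in compressed object streams, and a match is no proof of how the form behaves in practice.

```python
def classify_form(pdf_bytes):
    """Guess the form type of a PDF from a raw scan of its bytes.

    Heuristic only: this doesn't parse the PDF, and won't see dictionaries
    stored inside compressed object streams.
    """
    has_acroform = b"/AcroForm" in pdf_bytes
    has_xfa = b"/XFA" in pdf_bytes
    if has_xfa:
        # XFA is carried as an entry within the AcroForm dictionary.
        return "XFA"
    if has_acroform:
        return "AcroForm"
    return "no form found"


if __name__ == "__main__":
    import sys

    # Usage: python classify_form.py path/to/form.pdf
    with open(sys.argv[1], "rb") as f:
        print(classify_form(f.read()))
```

A result of "no form found" therefore means only that nothing was visible in a simple scan, not that the document is free of forms.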

Digital signatures

PDF forms may also require signing using digital signatures, rather than a digital image of a handwritten signature. Methods for issuing and using digital signatures vary according to their vendor, and must be specifically supported by a PDF app for them to be usable. Unfortunately, those offering digital signatures provide either no support or very limited support for non-Adobe apps running on macOS, so where digital signatures can be used, it’s with Adobe PDF products alone.

Let the nightmare begin

I’m very grateful to Robert, who kindly pointed me at the US Department of Homeland Security’s USCIS Form I-130, a petition for an alien relative, as a fine example of how PDF forms can get really messy.

Even when opened in my free Podofyllin, this form looks as if it’s all going to work out fine. It appears to use AcroForms rather than XFA Forms, so in theory we should be onto a winner without having to complete it using Adobe Acrobat.

As you start to complete the form, though, it becomes apparent that boxes which should unlock according to your entered responses, such as that to contain an apartment number, simply don’t work. This is almost certainly because they rely on JavaScript which isn’t functioning.

It’s only when you look inside the JavaScript embedded within the form that you discover that it’s supposed to perform a version check, both on the version of Adobe Acrobat being used, which must be 9.0 or later, and on the version of XFA supported by it, which must be 2.8 or later. So this form actually uses the proprietary Adobe XFA Form architecture, but because other PDF readers can’t run its JavaScript, they can’t even make the user aware of this fact.

I’m also unsure why this form even requires the features of this now-deprecated and proprietary architecture to enable text boxes which would only ever be completed when one of the preceding checkboxes was selected. I suspect this form could have been designed using widely-supported features of AcroForms, so that it could have been completed by a range of PDF software. Instead, its designers have opted for a proprietary format.

And that is so often the underlying problem of PDF forms.