More fun scripting with Swift and Xcode: opening docs and converting text

It’s another ‘it seemed a good idea at the time’ project: Rosettavert, a GUI wrapper for the command tool iconv.

If you haven’t met either mojibake or iconv, you have lived a cloistered and Roman-scripted life. Until the widespread adoption of Unicode, most languages which use non-Roman scripts and many older operating systems used all manner of character-encoding systems. They should all have died a long time ago, but there are an awful lot of documents on the internet which still don’t play Unicode.

A few years ago, I was dipping into the Georgian language, which uses non-Roman scripts. I stumbled upon hordes of free documents, most in PDF or text format, which used a text encoding system which I didn’t recognise. I bought a couple of text encoding conversion tools, but still there were some encodings which wouldn’t convert to proper Georgian characters. The App Store offers a few utilities, but they look quite old and tired, and I suspect are rapidly becoming incompatible with macOS.

Mojibake is a marvellous Japanese term which describes those frustrating times when you open a document only to find it turned into gibberish because of its arcane encoding. Now that retro-computing is becoming popular, more users are being confronted with text encodings from the past – including all the Mac, Windows, and other schemes for handling Arabic, Cyrillic, Japanese, and more.
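A minimal sketch of how mojibake arises, assuming nothing more than Foundation: the bytes of a UTF-8 string are decoded as if they were ISO Latin-1. The sample word is purely illustrative and isn't taken from Rosettavert.

```swift
import Foundation

// "é" in UTF-8 is the two bytes 0xC3 0xA9.
let original = "café"
let utf8Bytes = original.data(using: .utf8)!

// Decoding those bytes with the wrong encoding (ISO Latin-1) turns each
// byte into its own character, which is the classic mojibake.
let garbled = String(data: utf8Bytes, encoding: .isoLatin1)!

// Decoding with the correct encoding recovers the original text.
let recovered = String(data: utf8Bytes, encoding: .utf8)!

print(garbled)    // cafÃ©
print(recovered)  // café
```

The bytes on disk are identical in both cases; only the encoding assumed by the reader differs, which is why choosing the right source encoding is so often a matter of trial and error.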

Thankfully macOS provides a neat text encoding converter in iconv, a command tool which can convert between 144 different formats, many of which have several synonyms too. Although it’s not too demanding to use in Terminal, it has some quirks and most users would benefit from a friendly wrapper: a classic scripting challenge.

Because the selection of encoding is often a matter of trial and error, I want Rosettavert’s document window to have two scrolling text views: on the left, to display the original file, and on the right, the converted form. With so many options supported by iconv, they may not be suitable for popup menus, but in the initial stages I’ll work with a small subset arranged in two popup menus: one for the encoding used by the input file, the other for the encoding to be applied to the output.

This isn’t a particularly good fit with Apple’s concept of a document-based app, in which each window contains just one document. In Rosettavert, each window contains two documents, one of which mustn’t change, and the other of which is generated from it.

My previous document-based apps have only saved their contents, and not opened them. Working from MacAppScaffold, it was fairly quick to change its window to the form that I wanted.

rosetta1

For development purposes, I have left an extra text box under the buttons to view the command being generated (as in Consolation), and the Check button merely assembles the command for display in that box.

Writing file output uses similar code to my previous apps, in the data() function of NSDocument:
override func data(ofType typeName: String) throws -> Data {
    // Get the view controller which holds the two scrolling text views.
    if let theVC = self.windowControllers[0].contentViewController as? ViewController {
        // The right-hand view holds the converted text; write it out as UTF-8.
        let fileContentToWrite = theVC.textScrollContent2.string!
        return fileContentToWrite.data(using: String.Encoding.utf8) ?? Data()
    }
    else {
        return Data()
    }
}

This simply gets the ViewController for the window’s view, retrieves the text content from the scrolling text view on the right (the converted one), and returns it in UTF-8 form.

Reading a text file into the left-hand scrolling text view has to be performed in two parts, as it is initiated before the document window is open. The function to read the text is quite simple:
override func read(from data: Data, ofType typeName: String) throws {
    // Try to interpret the file contents as UTF-8 text.
    if let s = String(data: data, encoding: String.Encoding.utf8) {
        // Keep the text and the file path until the window exists.
        string = s
        thePath = (fileURL?.path)!
    }
    else {
        throw NSError(domain: NSOSStatusErrorDomain, code: unimpErr, userInfo: nil)
    }
}

where string and thePath are both variables declared in the NSDocument subclass.

With the file contents ready, when the window controllers are created, they can be inserted into the attributed text for the scrolling view:
override func makeWindowControllers() {
    let storyboard = NSStoryboard(name: "Main", bundle: nil)
    let windowController = storyboard.instantiateController(withIdentifier: "Document Window Controller") as! NSWindowController
    self.addWindowController(windowController)
    // The window now exists, so the text read earlier can go into the left-hand view.
    if let theVC = self.windowControllers[0].contentViewController as? ViewController {
        let attr = NSAttributedString(string: string)
        theVC.textScrollContent1.textStorage?.setAttributedString(attr)
        // Pass the input file path on for use when calling iconv.
        theVC.theInPath = thePath
    }
}

Trying to insert the string read from the file into the text scroller any earlier will result in problems. Together, these make up the neatly formatted source for the NSDocument subclass:

rosetta2

When you click on the Convert button, the code first checks that it has a file path for the input file, then assembles the arguments for calling iconv. If that returns without error, and the standard output buffer is not empty, it then stores the converted output in the scrolling text view. This is implemented in the formatted code below.

rosetta3
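For readers who can’t see the screenshot, here is a rough sketch of the same sequence of steps, not the app’s actual code. The function names iconvArguments and runIconv are hypothetical, the tool is assumed to live at /usr/bin/iconv, and the output is assumed to be UTF-8 so that it can be turned straight into a Swift String.

```swift
import Foundation

// Hypothetical helper: assembles the arguments for iconv, using its
// -f (from encoding) and -t (to encoding) options followed by the file path.
func iconvArguments(from: String, to: String, path: String) -> [String] {
    return ["-f", from, "-t", to, path]
}

// Hypothetical helper: runs iconv on the file and returns the converted text,
// or nil if there is no path, the tool fails, or it produces no output.
func runIconv(from: String, to: String, path: String) -> String? {
    // First check that there is an input file path to work from.
    guard !path.isEmpty else { return nil }

    let task = Process()
    // Assumption: iconv is installed at its usual macOS location.
    task.executableURL = URL(fileURLWithPath: "/usr/bin/iconv")
    task.arguments = iconvArguments(from: from, to: to, path: path)

    let outPipe = Pipe()
    task.standardOutput = outPipe
    // Error messages are discarded in this sketch.
    task.standardError = FileHandle.nullDevice

    do {
        try task.run()
    } catch {
        return nil
    }

    // Read everything from standard output, then wait for the tool to finish.
    let data = outPipe.fileHandleForReading.readDataToEndOfFile()
    task.waitUntilExit()

    // Only accept the result if iconv succeeded and produced some output.
    guard task.terminationStatus == 0, !data.isEmpty else { return nil }

    // Assumption: the target encoding was UTF-8.
    return String(data: data, encoding: .utf8)
}
```

A caller would then do something like `if let text = runIconv(from: "ISO-8859-1", to: "UTF-8", path: theInPath) { ... }` and put the result into the right-hand text view. A target encoding other than UTF-8 would need the returned data handled differently.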

iconv can work from file contents supplied on standard input, or directly from the file. Although it is not as efficient, requiring the file to be read afresh for each conversion, at this stage I prefer to work from the file, so that there is no risk of the app’s handling of its contents altering the data in any way.

Because each window effectively represents two documents, one which is only read from disk and the other which is only written to disk, I have disabled and hidden the Save menu command; that would overwrite the input document with the output, which is something only the user should be able to choose to do.

The next phase is to work on how to offer the 144 different encodings. I’ll see how unmanageable they make popup menus before looking at alternatives.
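One possible starting point, sketched here under assumptions rather than as a settled design, is to build the list from the output of `iconv -l`. This assumes that each line of that listing holds one encoding followed by its synonyms, separated by spaces; the exact format may differ between iconv versions. EncodingEntry and parseEncodingList are hypothetical names.

```swift
import Foundation

// Hypothetical model: one encoding and the other names iconv accepts for it.
struct EncodingEntry {
    let name: String
    let synonyms: [String]
}

// Hypothetical parser for text in the assumed `iconv -l` layout:
// one encoding per line, names separated by spaces, first name taken as primary.
func parseEncodingList(_ listing: String) -> [EncodingEntry] {
    return listing.split(separator: "\n").compactMap { line -> EncodingEntry? in
        let names = line.split(separator: " ").map(String.init)
        // Skip any line which contains no names at all.
        guard let first = names.first else { return nil }
        return EncodingEntry(name: first, synonyms: Array(names.dropFirst()))
    }
}

// Illustrative sample only, not the real output of iconv.
let sample = "UTF-8 UTF8\nISO-8859-1 ISO_8859-1 LATIN1\n"
let entries = parseEncodingList(sample)
print(entries.map { $0.name })
```

Keeping the synonyms with each primary name would let a menu show one item per encoding while still allowing a search to match any of its alternative names.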