When you run a website like this blog, you get used to visitors remarking how they had never realised that your site existed, and how they wish they had discovered it some time ago. You also get used to visitors asking questions which are answered in full in other articles on the site. As a long-time Q&A columnist, I am well used to readers asking me questions which have already been answered in many articles on the internet.
There are two interrelated problems: first, that search engines don’t work for many (or even most) people, and second, that when people do chance upon a helpful site, finding what they want there is still far from straightforward.
We all encounter this every day. Just this last week, when I was preparing an article on SIP, I wanted to check the details of the recent change to its command line interface, which makes it possible to enable SIP without entering Recovery mode. I remembered that it had been covered by several sites (including this blog) when the change was introduced late last year. But all the hits that I could muster were old articles telling the same old simplified story from before. None had been updated in the light of that change, and their search engine ranking was being driven primarily by popularity rather than by quality or relevance of content.
That’s the elephant in the room for search engines: quality of content. When you’re looking for information, it doesn’t matter whether the site earns a lot of advertising revenue, or whether that page has been viewed a million times. What matters is whether the information is there, and whether it’s accurate.
Whatever else might get factored into search rankings, we know that quality and accuracy of content are not, and we know that search engines do not ‘learn’ in any AI way to improve the accuracy of their searches on the basis of quality or accuracy of content. In other words, search algorithms are indifferent to the main objectives of many, if not most, searches: they are at best just an incredibly elaborate game of blind man’s buff (or bluff).
For many topics, Wikipedia is a good way forward, and its popularity is a measure of the failure of search engines. Despite its own flaws, it remains the most reliable way of obtaining fairly accurate information on many subjects. But it is of little practical value when it comes to diagnosing and managing computer problems.
In the case of diagnosing and managing computer problems, as in so many areas, our existing web model falls short of what users require. If you already know most of the answer, navigating through a sea of disjoint articles is perhaps not too bad. Starting from scratch and trying to assemble those into a coherent understanding is a major challenge, which is often beyond the time and resources of users. What you need is an integrated account, something like a book.
The snag with most electronic publishing formats is that they are little more than books in browsers, and are extraordinarily limited and inflexible. They do have basic HTML-type links, can incorporate an impressive range of different types of content, and allow you to place bookmarks, notes, etc.
But spend a little time with proper hypertext in apps like Storyspace and Tinderbox and you’ll quickly see what I mean: a range of different types of link, text substitution, different graphical ways of visualising document structure, and much more. Of course it’s possible to emulate some of their functions using scripts and stylesheets, and there are hypertext authoring environments which are starting to do that. They’re acceptable where content has to be published online, but for offline use they prove a pale imitation.
This all leads me to the conclusion that, whilst I should continue to write articles for this blog, I should also assemble, edit, and adapt them into integrated accounts for offline use. I started this experiment last week, as I described here in Moving a blog to Tinderbox: Troubleshooting Macs.
How this should ultimately be delivered is an interesting question that I have already been asked. It is easy to be misled by drawing false parallels. For example, I mentioned Ian Page’s invaluable Mactracker, a standalone app which delivers complete, systematic and detailed information about all Apple’s hardware and operating system products.
The most basic Tinderbox/Storyspace documents are far more complex and sophisticated, making the Storyspace Reader app (which is free) the minimum entry point. Use anything less than that, and you quickly fall back to the sad compromises that have been made in electronic publishing. These are fine for pumping out popular ‘page-turner’ novels in electronic format, but impoverished for highly interlinked offline technical documentation (and much else).
If we’re going to make real progress in electronic publishing, we need to break free from the book-in-a-browser model.