Explainer: defragmentation

Storage is only half a concept, its essential companion being retrieval.

Consider a large library containing tens or even hundreds of thousands of books. Major reference works are often published in a series of volumes. When you need to consult several consecutive volumes of such a work, how they’re stored is critical to the task. If someone has tucked each volume away in a different location within the stack, assembling those you need is going to take a long while. If all its volumes are kept in sequence on a single shelf, that’s far quicker. That’s why fragmentation of data has been so important in computer storage.

Cast your mind back a few years, and one of the essential software utilities every serious Mac user had and used was something to defragment (defrag) their hard disks. Even if you didn’t believe in the importance of defragging your files, you knew that you should defrag free space, and that rising fragmentation could seriously impair your Mac’s performance. Who now spends time every few weeks defragging anything?

The story of defragging on the Mac is perhaps best illustrated in the rise and fall of Coriolis Systems and iDefrag. Coriolis was started in 2004, initially to develop a tool for non-destructive re-partitioning of HFS+ disks, but its founder Alastair Houghton was soon offering iDefrag, which became a popular defragging tool. This proved profitable until SSDs became more widespread and Apple released its APFS file system in High Sierra, forcing Coriolis to shut down in 2019, when defragging Macs effectively ceased.

All storage media, including memory, SSDs and rotating hard disks, can develop fragmentation, but most serious attention has been paid to the problem on hard disks. This is because of their electro-mechanical mechanism for seeking to locations on the spinning platter they use for storage. To read a fragmented file sequentially, the read-write head has to keep physically moving to new positions, which takes time and contributes to ageing and eventual failure of the mechanism. Although solid-state media can incur a slight overhead when accessing disparate storage blocks sequentially, this isn’t thought significant, and attempts to address it invariably have greater disadvantages.
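To put rough numbers on that cost of head movement, here is a minimal sketch of how read time on a hard disk might scale with the number of fragments. It assumes each fragment costs one average seek plus half a platter rotation before data starts to flow. The function name and the default figures (9 ms seek, 7,200 rpm, 150 MB/s sustained transfer) are illustrative assumptions, not measurements from any particular drive.

```python
def estimated_read_seconds(size_mb, fragments,
                           seek_ms=9.0,              # assumed average seek time
                           rotation_rpm=7200,        # assumed spindle speed
                           transfer_mb_per_s=150.0): # assumed sustained transfer rate
    """Rough estimate of the time to read a file stored in `fragments` pieces."""
    if fragments < 1:
        raise ValueError("a file occupies at least one fragment")
    # On average the platter must turn half a revolution before the
    # wanted sector passes under the head.
    half_rotation_ms = 0.5 * 60_000 / rotation_rpm
    # Every fragment needs the head to be repositioned once.
    positioning_s = fragments * (seek_ms + half_rotation_ms) / 1000
    # The data itself transfers at the same rate however it is laid out.
    transfer_s = size_mb / transfer_mb_per_s
    return positioning_s + transfer_s


if __name__ == "__main__":
    for pieces in (1, 100, 1000):
        seconds = estimated_read_seconds(150, pieces)
        print(f"{pieces:>5} fragments: {seconds:.2f} s")
```

Under those assumptions, positioning time is negligible for a contiguous file, but it grows in proportion to the number of fragments and comes to dominate the transfer time itself. Real drives complicate this with caching, command queuing and short seeks between nearby fragments, so treat it as an order-of-magnitude illustration only.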

Fragmentation on hard disks comes in three quite distinct forms: file data across most of the storage, file system metadata, and free space. Different strategies and products have been used to tackle each of those, with varying degrees of success. While few doubt the performance benefits achieved immediately after defragging each of those, little attention has been paid to demonstrating more lasting benefits, which remain dubious.
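For file data and free space, fragmentation can be described in terms of extents, contiguous runs of blocks. The sketch below assumes a hypothetical representation in which each extent is a `(start_block, length)` pair. It isn’t how HFS+ or APFS record their allocations, and the function names are invented for illustration.

```python
def count_fragments(extents):
    """Count the separate pieces of a file.

    `extents` lists (start_block, length) pairs in the file's logical order.
    Extents that follow directly on from one another count as one fragment.
    """
    if not extents:
        return 0
    fragments = 1
    for (prev_start, prev_len), (start, _) in zip(extents, extents[1:]):
        if start != prev_start + prev_len:
            fragments += 1
    return fragments


def free_space_runs(total_blocks, used_extents):
    """Return the lengths of the contiguous free runs on a volume."""
    runs = []
    cursor = 0  # first block not yet accounted for
    for start, length in sorted(used_extents):
        if start > cursor:
            runs.append(start - cursor)
        cursor = max(cursor, start + length)
    if cursor < total_blocks:
        runs.append(total_blocks - cursor)
    return runs


if __name__ == "__main__":
    # A file in three extents, the first two of which are adjacent.
    print(count_fragments([(0, 4), (4, 4), (100, 2)]))
    # A 20-block volume with two used regions.
    print(free_space_runs(20, [(2, 3), (10, 5)]))
```

Many short free runs in place of a few long ones is what free-space fragmentation amounts to: new files then have to be split to fit the gaps, which in turn fragments file data.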

Manually defragging HFS+ hard disks was always a questionable activity, as Apple added background defragmentation to Mac OS X 10.2, released two years before Coriolis was even founded. By El Capitan and Sierra that built-in defragging was highly effective, and the need for manual defragging had almost certainly become a popular myth. Few considered the adverse effects on hard disk longevity of those intense periods of disk activity either.

The arrival of APFS in High Sierra brought fragmentation back to the fore as, being designed primarily for solid-state storage, it cares not about fragmentation. Users quickly reported poor and deteriorating performance when using Apple’s new file system on rotating hard disks. It took the painstaking work of Mike Bombich and others to demonstrate the root of the problem: while data fragmentation isn’t insignificant, the most serious impact on performance is the result of severe fragmentation in file system metadata.

Conventional wisdom is that fragmentation on SSDs is unimportant, and that defragging them is not just unnecessary but harmful, but neither of those claims is as accurate as often thought. Retrieving heavily fragmented files from SSDs does take slightly longer, and there are times when defragging may be necessary to avoid system limits on the number of file fragments that can be supported. This has been the case for Windows, which (at least in 2014) still performed a monthly ‘intelligent’ defrag of SSDs used to store its snapshots. That doesn’t appear to be of any value for SSDs with APFS, though.

There is no doubt that the intensive defrag routines many users performed in the past with hard disks are gone for good, particularly if you’ve been able to switch over completely to SSDs. APFS is perfectly usable on hard disks used primarily to store files without repeatedly changing them, such as APFS Time Machine backups and archives. But the few users condemned to boot macOS versions requiring APFS from a hard disk will vouch that it will never be from choice.

Was defragging as important as some made out, or just a widespread hoax? We’ll never know, but I’m only too delighted that it has now gone for good.