Before Mac OS X, Disk First Aid used to be a separate app that you could run on HFS+ volumes or partitions, which are the same thing for that file system, to verify or repair them.
Disk Utility, with its integral First Aid feature, is one of very few utilities that have survived relatively unchanged since the first public beta release of OS X. One significant change has been the merging of verify and repair into the single First Aid button. This article looks at two contentious issues in First Aid: whether you need to run it on APFS volumes and containers, and how to work round the status 65 bug that can prevent First Aid from working.
For the purposes of this article, I’ll consider the following disk structure:
- At the top level, disk4 is divided into two partitions, disk4s1 and disk4s2.
- disk4s1 is a hidden EFI partition in MSDOS format.
- disk4s2 is an APFS container, disk5.
- disk5 contains one or more APFS volumes disk5s1, disk5s2, and so on.
Volume First Aid
Select an APFS volume and click First Aid, and Disk Utility runs a command like
fsck_apfs -y -x /dev/rdisk5s1
on that selected volume alone. The -y option tells fsck_apfs to repair any errors that it can, if it detects any, and the undocumented -x option runs this through its XPC interface to the Disk Utility app. Note that the volume is specified using its raw device name of rdisk5s1, and not the block device disk5s1. As mentioned below, this overcomes any inconsistencies that could arise with caches.
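As a minimal sketch of the naming convention described here, the raw device path is the block device identifier with an r prefixed. The helper name raw_device is hypothetical and not part of macOS; it only rewrites a string and doesn't touch any disk.

```shell
# raw_device is a hypothetical helper: it turns a block device identifier
# such as disk5s1 (with or without /dev/) into its raw device path.
raw_device() {
    dev="${1#/dev/}"    # drop a leading /dev/ if one was given
    dev="${dev#r}"      # drop an existing r prefix so it isn't doubled
    printf '/dev/r%s\n' "$dev"
}

raw_device disk5s1          # prints /dev/rdisk5s1
raw_device /dev/disk5s1     # prints /dev/rdisk5s1
raw_device /dev/rdisk5s1    # prints /dev/rdisk5s1
```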
Unless you use its -S option, fsck_apfs will check every snapshot of that volume that it finds, which can take a long time. There’s no option in Disk Utility to skip a full snapshot check, though.
When you want to check and repair a specific volume, the best option is to select that volume and run First Aid on it. If that returns any warnings or errors, then follow that by running First Aid on its container too.
Container First Aid
Select an APFS container and click First Aid, and Disk Utility runs a command like
fsck_apfs -y -x /dev/disk4s2
on that selected container, using the same options as for a volume. This doesn’t need to refer to the raw device, but Disk Utility does refer to the container using its ‘partition’ name rather than as disk5.
When fsck_apfs checks and repairs a container, it first works on container structures, then iterates through checking and repairing each volume within that container. That includes all hidden volumes, which you can’t select in Disk Utility, and with those volumes all their snapshots too. Checks and repairs made during this are the equivalent of performing
fsck_apfs -y -x /dev/rdisk5s1
then
fsck_apfs -y -x /dev/rdisk5s2
and so on until all the container’s volumes have been checked.
When you want to check the health of all the volumes in a container, the best option is to select that container and run First Aid on it. As that checks and repairs all volumes in that container, there seems no point in checking them individually unless you suspect them of having problems, in which case you should do that before checking the whole container.
Disk First Aid
Select a disk and click First Aid, and Disk Utility doesn’t run fsck_apfs at all, but checks the disk partition map. It doesn’t check any partitions or containers beyond that.
As Apple recommends, this can be left until after you have checked containers.
fsck_apfs
As Disk Utility reveals the commands that it uses, if you’re happy using the command line in Terminal, you can perform equivalent checks there. This gives you more flexibility, with the command options
- -n instead of -y, so that errors and warnings are reported but not repaired;
- -S so that the check doesn’t iterate through each snapshot, although it still checks them at top level;
- -o to repair any overallocation; however you must be careful to ensure that you don’t use this from an older fsck_apfs on a newer file system, or data loss and disaster could result.
When used on volumes, ensure that you refer to the raw device, such as /dev/rdisk5s1, to avoid inconsistencies. For containers, it’s worth following Apple’s practice and referring to the partition such as /dev/disk4s2 rather than its container equivalent /dev/disk5.
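The following sketch puts those options together by printing a command line rather than running it, so you can inspect it before pasting it into Terminal. The helper name fsck_cmd and its check, repair and skip-snapshots keywords are hypothetical, invented for illustration; only the -n, -y and -S options and the device names come from the text above. The -x option is left out because it's described as the interface to Disk Utility, and -o is left out because of the risk noted above.

```shell
# fsck_cmd is a hypothetical helper that prints, but does not run,
# an fsck_apfs command line.
# Usage: fsck_cmd check|repair [skip-snapshots] device
fsck_cmd() {
    case "$1" in
        check)  opts="-n" ;;    # report errors and warnings, repair nothing
        repair) opts="-y" ;;    # repair whatever can be repaired
        *) echo "usage: fsck_cmd check|repair [skip-snapshots] device" >&2
           return 2 ;;
    esac
    shift
    if [ "$1" = "skip-snapshots" ]; then
        opts="$opts -S"         # don't iterate through each snapshot
        shift
    fi
    [ -n "$1" ] || { echo "fsck_cmd: missing device" >&2; return 2; }
    printf 'fsck_apfs %s /dev/%s\n' "$opts" "$1"
}

# A volume, by its raw device, without the full snapshot check:
fsck_cmd check skip-snapshots rdisk5s1    # prints fsck_apfs -n -S /dev/rdisk5s1
# A container, by its partition name:
fsck_cmd repair disk4s2                   # prints fsck_apfs -y /dev/disk4s2
```

Running the printed command will probably need sudo, as it reads the device directly.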
Status 65
APFS keeps a count of all volumes mounted within each container, nx_num_vols_mounted, which is incremented each time a volume is mounted, and decremented each time a volume is unmounted. APFS will only unload a container when nx_num_vols_mounted is zero, i.e. all its volumes have been unmounted.
When snapshots of volumes are mounted, those also increment nx_num_vols_mounted, but they aren’t automatically unmounted when preparing to perform an fsck_apfs. Thus, if there are 70 snapshots in a Time Machine backup, when APFS has unmounted all the other volumes, nx_num_vols_mounted will fall to 70, but no attempt is made to unmount those snapshots. With nx_num_vols_mounted stuck at 70, APFS is unable to unload the container, and fsck_apfs then returns a status of 65.
I have previously described how you can manually unmount Time Machine backup snapshots; a far simpler method in Disk Utility is simply to select the disk and unmount that, which now appears reliable. Once the disk has unmounted, select the volume or container you wish to check and repair, then click on First Aid.
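If you work in Terminal instead, the same workaround can be sketched as below. The helper name explain_status is hypothetical, and it assumes that the status 65 described here is what the shell sees as the exit status of fsck_apfs, and that 0 means success by the usual convention. The commented lines show how it might be combined with diskutil unmountDisk for the example disk4; they are left as comments so nothing is unmounted by accident.

```shell
# explain_status is a hypothetical helper that turns an exit status into
# advice. Assumption: only 0 and 65 are given specific meanings here.
explain_status() {
    case "$1" in
        0)  echo "check completed" ;;
        65) echo "status 65: unmount the whole disk, then try again" ;;
        *)  echo "failed with status $1" ;;
    esac
}

explain_status 65    # prints status 65: unmount the whole disk, then try again

# Possible use, assuming the example layout of disk4 and disk5s1:
#   sudo fsck_apfs -n /dev/rdisk5s1
#   explain_status $?
#   diskutil unmountDisk /dev/disk4    # then repeat the check
```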
Summary
- If you suspect problems with a specific volume, select it and click First Aid. If that returns any warnings or errors, select its container and run First Aid on that too.
- If you want to check all volumes within a container, select the container and click First Aid. There seems little point in starting by checking each volume first.
- Run First Aid on the disk last of all, when you’re happy that its partitions/containers are healthy.
- fsck_apfs has more options if you’re happy with the command line and observe the advice above.
- If any First Aid or fsck_apfs run returns a status 65, unmount that disk and try again.