hoakley July 3, 2026 Macs, Technology

Sort order, collation and the Finder

Have you noticed that the order in which items are listed in a Finder window is different from that used by the ls command in Terminal? For example, the Finder lists 20test.text before 200test.text, while ls lists them in reverse. You can see more differences in a longer listing of a test folder.

In the Finder, running with English (UK) as primary language and List sort order set to Universal:
01test.text 2test.text 02test.text 3test.text 20test.text 200test.text Atest.text åtest.text atest2.text átest2.text åtest2.text atest3.text atest12.text atest20.text btest.text

Using ls in Terminal that becomes:
01test.text 02test.text 200test.text 20test.text 2test.text 3test.text åtest.text Atest.text atest12.text átest2.text atest2.text åtest2.text atest20.text atest3.text btest.text

To understand the differences, I’ll consider the behaviours involved.

Numbers and ‘natural’ sort order

Perhaps the most obvious difference is how the two sort orders treat numbers. The Finder orders those according to their whole value, whether the numbers are at the start or end of the name. Thus in the Finder’s view 2 and 02 precede 3, as they’re smaller, and the highest number and last of that sequence is 200. The ls command simply compares them one digit at a time, so 0 precedes 2 regardless of what digits follow it.
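
To illustrate the difference, here’s a minimal Python sketch. It isn’t the Finder’s actual algorithm. It assumes that names are split into runs of digits and runs of other characters, that digit runs compare by numeric value, and that ties such as 2 and 02 are broken by putting the shorter run first. The helper natural_key is my own hypothetical name.

```python
import re

def natural_key(name):
    """Sort key treating each run of ASCII digits as a whole number.

    Assumption: ties between equal values (2 and 02) are broken by
    putting the shorter digit run first.
    """
    # re.split with a capturing group alternates text, digits, text, ...
    parts = re.split(r"([0-9]+)", name)
    key = []
    for index, part in enumerate(parts):
        if index % 2:                      # odd positions hold digit runs
            key.append((int(part), len(part)))
        else:                              # even positions hold other text
            key.append(part.casefold())
    return key

names = ["200test.text", "02test.text", "3test.text",
         "20test.text", "01test.text", "2test.text"]

character_order = sorted(names)                 # one character at a time
natural_order = sorted(names, key=natural_key)  # digit runs by value

print(character_order)
print(natural_order)
```

For these six names, character_order matches the ls listing above, and natural_order matches the Finder’s.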

As we often embed numbers in file names, this is important, and in the latter years of classic Mac OS became a contentious issue, with campaigners like Adam Engst and Stuart Cheshire encouraging Apple to adopt this ‘natural’ sort order in 1997. And it did, with full support for modern ordering available from OS X 10.6.

Case

File systems in Mac OS have traditionally been case-insensitive but case-preserving, unlike variants used by iOS. This means that files named Atest.text and atest.text cannot exist within the same directory, but wherever Atest.text goes it retains its name with an uppercase first character. Both orderings therefore disregard case, as is common practice. However, some schemes for sort ordering list uppercase before lowercase, and others do the reverse.
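
As a rough illustration of case-insensitive but case-preserving behaviour, here’s a toy directory in Python. The Directory class and its methods are hypothetical, and it assumes that str.casefold() is a fair stand-in for the file system’s own case-folding rules, which it may not be.

```python
class Directory:
    """Toy directory: names collide regardless of case, but keep their case."""

    def __init__(self):
        self._entries = {}   # case-folded name -> name as originally given

    def create(self, name):
        key = name.casefold()          # assumption: stand-in for real folding
        if key in self._entries:
            raise FileExistsError(f"{name} clashes with {self._entries[key]}")
        self._entries[key] = name

    def listing(self):
        # Sort on the folded key, but return the preserved names.
        return [self._entries[key] for key in sorted(self._entries)]

folder = Directory()
folder.create("btest.text")
folder.create("Atest.text")
try:
    folder.create("atest.text")        # differs from Atest.text only in case
    clash = False
except FileExistsError:
    clash = True

print(clash, folder.listing())
```

The second attempt is refused, and the listing still shows Atest.text with its uppercase first character.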

Accents and diacritics

Different languages, and sometimes even their regional variants, treat accents and diacritics according to different rules. Most commonly, for sorting purposes they are treated as having the same base character, but as shown here the order within that may differ. This is a complicated area, as illustrated by the Nordic letter Ø, which in Denmark and Norway is treated as distinct from O rather than an accented variant, and placed at the end of the alphabet after Z. If you’ve tried to look a term up in a Danish book, or use a Danish phone directory, you’ll know how confusing that proves.
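
One way to picture ‘the same base character’ is to strip accents before comparing. The Python sketch below does that under loud assumptions: that removing combining marks after decomposition approximates the base letter, and that ties are then broken by the raw string, which is an arbitrary choice. It knows nothing about languages or regions, and base_letters is a hypothetical helper.

```python
import unicodedata

def base_letters(name):
    """Return name with combining marks removed and case folded."""
    decomposed = unicodedata.normalize("NFD", name)
    stripped = "".join(ch for ch in decomposed
                       if not unicodedata.combining(ch))
    return stripped.casefold()

# \u00e5 is å as a single code point
names = ["btest.text", "\u00e5test.text", "Atest.text"]

by_code_point = sorted(names)
# Primary comparison on base letters; arbitrary tie-break on the raw name.
by_base = sorted(names, key=lambda n: (base_letters(n), n))

print(by_code_point)
print(by_base)
```

With this key åtest.text joins Atest.text ahead of btest.text, whereas plain code point order leaves it after btest.text. Ø has no canonical decomposition, so this sketch always leaves it distinct from O.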

Unicode normalisation

Some Unicode characters can be formed using more than one sequence of codepoints. For example, the accented character é can be represented as UTF-8 c3 a9 (Form C) or 65 cc 81 (Form D), and the two are identical in appearance. Although early versions of APFS for macOS ignored normalisation, it now normalises filenames just as HFS+ does. An initial normalisation step ensures the existence of two different forms doesn’t affect sort order.
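
The two forms of é can be checked with Python’s standard unicodedata module. This is only a sketch of the normalisation step: the helper names are hypothetical, and the choice of Form D as the default here is an assumption, as either form works so long as both strings use the same one.

```python
import unicodedata

form_c = "\u00e9"      # é as one code point
form_d = "e\u0301"     # e followed by a combining acute accent

def utf8_hex(text):
    """UTF-8 bytes of text as space-separated hex."""
    return text.encode("utf-8").hex(" ")

def same_after_normalising(a, b, form="NFD"):
    """Compare two strings after converting both to one normalisation form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

print(utf8_hex(form_c))                         # c3 a9
print(utf8_hex(form_d))                         # 65 cc 81
print(form_c == form_d)                         # False
print(same_after_normalising(form_c, form_d))   # True
```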

Unicode Collation Standard and macOS

What was once so simple in ASCII has become a complex set of rules that vary by language, region and practice. These have been standardised for Unicode in its collation algorithm, used to determine sort order of strings of characters. The rules appear to be embedded in a set of binary files found in ~/Library/Metadata/CoreSpotlight.

The user has limited control over the sort order used in macOS. It must be the least-used feature in Language & Region settings, where it’s only offered when there are additional languages like French included in its list of Preferred Languages. Collations are included in Foundation’s Locale, and third-party code has access to the same collation as used by the Finder through Foundation’s localizedStandardCompare().

In the last century sorting and searching were early and major topics in learning programming and computer science. Thankfully that was long before they became so complex and dependent on collation rules.

12 Comments


1

joethewalrus on July 3, 2026 at 8:17 am

I developed the habit young of always using leading zeroes when numbering files, usually starting with 01, but 001 or 0001 if I thought it necessary. This has served me well, and I didn’t even realize macOS had changed to natural numbering until just a couple years ago. Kudos to Adam Engst and Stuart Cheshire for fighting the good fight.

- 2
  
  hoakley on July 3, 2026 at 11:10 am
  
  Thank you, Joe. It’s always safest to use a coherent numbering system if order is important.
  Howard.
  
3

eyelessjerry on July 3, 2026 at 11:42 am

Checked with BBEdit and it normally sorts rows like the ls command, but I think neither that nor the Apple (Finder) way makes most sense. If you sort rows starting with numbers, you would normally want to have them ordered strictly, with higher numbers below. In BBEdit I noted you can have this done in Sort Lines by checking ‘Numbers match by value’, which I had not thought much of before as I normally sort IP addresses and they are split up by 3 digits, but a few times I have tried to order long numbered lists and then this is welcome.

- 4
  
  eyelessjerry on July 3, 2026 at 4:24 pm
  
  Hmm – seems I mistook the Apple sort order as it now looks right to me … anyway normally one can sort lines with preceding numbers best in Numbers (but also BBEdit as it was).
  
  - 5
    
    hoakley on July 3, 2026 at 7:21 pm
    
    As a matter of interest, what sort order do you have in System Settings? Do you have any option other than Universal?
    Howard.
    
    - 6
      
      eyelessjerry on July 4, 2026 at 12:16 pm
      
      Sorry – was mostly reacting from memory regarding the sort order in the Finder, I think, and might not have paid attention to your exact wording there.
      
      Actually, reading the sentence again about “universal sort order” made me wonder if I know anything at all about macOS. I have never in my life seen or heard anything about this and cannot find anything about it anywhere (not even on the Internet). It might mean sort by Name, as I nearly always do, except where I agree with Apple’s default sort order of last added in Downloads. Downloads and Trash are the only locations I normally open in List view; otherwise I always use Column view, unless I need to see the files’ creation or modification dates etc. Or maybe it means something else, but I cannot figure out what you mean.
      
    - 7
      
      hoakley on July 4, 2026 at 4:19 pm
      
      I’m sorry, but what I’m referring to here is simply the sorting order used for filenames. I thought that was clear.
      There are two Finder view types offering that, usually by default, which are widely used to list files in order of their names: Column and List views. When sorted by name in either of those, the Finder uses the collation order I have described here, which is different from that used by ls, for example.
      Howard.
      
    - 8
      
      eyelessjerry on July 4, 2026 at 12:54 pm
      
      Possibly you mean that no group setting or any other special setting is applied when looking at the files ordered by name. That is the best I can make out of it.
      
9

John Gilbert on July 4, 2026 at 3:26 am

I am sure that maintaining POSIX compliance dictates the order used by ls and other terminal commands. And also the default order for the sort command which specifies “the collating sequence of the current locale”. These sort orders are all character by character.

Just the sort of single characters in file/folder names has caused confusion for me. Around 15 years ago, I assumed this was the same as the ASCII printable character order (as it was a generation ago when I could slowly read 7-hole paper tape). But it is not. All the non-alpha-numerics come before any numbers or letters and, in some cases, in a different order.

I do make use of single leading non alpha-numerics to place selected sub-folders at the top in Finder. macOS (and a few apps) do the same with a leading dot or underscore as in .DS_Store, .sync or _Archive.

- 10
  
  hoakley on July 4, 2026 at 9:40 am
  
  Thank you, John – you’re right of course, although Posix is inevitably flexible, in that it too refers to collation by locale settings, as you say. What’s remarkable reading the Posix documentation is that it remains steadfastly pre-Unicode, and indeed is still written around ASCII. For example, there’s no mention I can see of normalisation, which is the first task of the Unicode algorithm regardless of locale. And all the characters referred to are drawn from the ASCII character set.
  Howard.
  
11

peterkillick on July 5, 2026 at 9:51 am

Sorting in the Finder could perhaps be an area where Apple’s AI could be gainfully employed one day. Finder has no problem putting either Arabic or Roman numerals in their correct order, but with written numbers (one, two, three, four), it will put three before two and four at the start.

It is not difficult to think of examples where the ability to apply more human logic to this kind of exercise would be genuinely useful.

- 12
  
  hoakley on July 5, 2026 at 8:56 pm
  
  Thank you. Although it’s an attractive idea, I think we need simple invariant rules, something that is anathema to AI, where each iteration is randomised and could lead to different behaviour.
  Howard.
  