unorml is a command tool that applies any of the four standard Unicode normalisation forms to strings. This is most valuable when you want to convert filenames or paths from one normalisation form to another. For example, if you supply a string containing the word café in form C (or any other), then
unorml -d 'café'
returns that string normalised to form D. Options are -c, -d, -kc and -kd to support each of the four standard forms.
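To see what the four forms actually do to a string, here is a minimal sketch in Python using its standard unicodedata module. It is not unorml's own source code, and the mapping from unorml's options to the form names NFC, NFD, NFKC and NFKD is my assumption for illustration, as is the helper name normalise.

```python
import unicodedata

# Assumed mapping from unorml's options to the standard form names.
FORMS = {"-c": "NFC", "-d": "NFD", "-kc": "NFKC", "-kd": "NFKD"}

def normalise(option, text):
    """Return text normalised to the form selected by option (hypothetical helper)."""
    return unicodedata.normalize(FORMS[option], text)

composed = "caf\u00e9"                  # 'café' in form C: é is one code point
decomposed = normalise("-d", composed)  # form D: e followed by a combining acute accent

print(len(composed), len(decomposed))
print([hex(ord(ch)) for ch in decomposed])
```

The two strings look identical when displayed, but the form D version is one code point longer, because é has been split into a plain e and the combining accent U+0301.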
This tool is supplied with its own source code, which you are welcome to adapt or use in your own code. Normalisation is really simple to perform, but amazingly frustrating when you can’t find any way of doing it.
If you want to explore Unicode normalisation more extensively, then I commend my free GUI utility Apfelstrudel, which not only shows all normalised forms, but also performs standard string operations on them to show what is safe, and what will break with incorrect normalisation.
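As a rough illustration of the kind of breakage that incorrect normalisation causes, the following Python sketch compares the same word in forms C and D. It uses only the standard unicodedata module and isn't taken from Apfelstrudel. The helper same_text is a hypothetical name, and normalising both sides before comparing is just one common way of making such operations safe.

```python
import unicodedata

form_c = unicodedata.normalize("NFC", "caf\u00e9")
form_d = unicodedata.normalize("NFD", "caf\u00e9")

# Naive operations treat the two forms as different strings.
print(form_c == form_d)      # equality fails
print("\u00e9" in form_d)    # searching for the composed é fails
print(form_d.endswith("e"))  # also fails: the last code point is the accent

def same_text(a, b, form="NFC"):
    """Compare two strings after normalising both to the same form (hypothetical helper)."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

print(same_text(form_c, form_d))
```

The first three tests print False even though both strings display as café. Only the comparison made after normalising both strings to a common form reports them as the same text.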
unorml is now available in version 4, which is a Universal binary, and runs native on all versions of macOS from Sierra to Big Sur betas, on both Intel and Apple Silicon Macs. That’s particularly important with command tools, as the last thing that you want on your shiny new Apple Silicon system is for every call to a command to have to wait for Rosetta 2 to translate an Intel binary to run on ARM processors.
I’ve already explained how mixing Intel-only apps and tools can pose problems: it’s far better on an Apple Silicon system to run a complete calling chain using a single architecture where you can. This update now makes this possible.
unorml version 4 is fully notarized, and supplied with a convenient Installer package, its source code, and full documentation. It is available here: unorml4, from Downloads above, and from its Product Page, where you will also find Apfelstrudel.
I hope that it proves useful to you. I will continue to leave its Intel-only version available from its Product Page, in the event that you encounter any problems with this new version.