Name that Folder/File

Different file systems have different requirements for the naming of files and folders. When you want to move files between different systems, these can trip you up.

The first computer file systems were primitive when it came to naming of files. For years PC users were trapped in MS-DOS’s (FAT16, 1987) restriction of eight characters, a dot, and a three character extension, whilst Mac users employed deliberately florid names flaunting the features of its Hierarchical File System (HFS, 1985).

A lot has changed since then, with recent releases of Windows sporting the NT file system (NTFS, 1993), and several other systems being adopted for Unix and Linux. Most, though, have distinct limitations in the characters that can be used in file and folder names: Windows implementations of NTFS, for example, allow 255 Unicode characters including upper and lower case, but exclude the ‘special’ characters ?, ", /, \, <, >, *, |, and :, although most of those are permissible on a Mac.
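As a rough illustration of checking names before a move to Windows, the sketch below tests a name against the commonly documented Windows set of forbidden characters (which includes the angle brackets < and >) and the 255-character limit. The function name windows_name_problems is made up for this example, and the check ignores other Windows rules such as reserved device names.

```python
# Characters that Windows rejects in file and folder names on NTFS.
WINDOWS_FORBIDDEN = set('?"/\\<>*|:')
MAX_NAME_LENGTH = 255  # limit quoted in the text, counted in UTF-16 units


def windows_name_problems(name):
    """Return a list of reasons why `name` would be rejected, empty if none."""
    problems = []
    bad = sorted(set(name) & WINDOWS_FORBIDDEN)
    if bad:
        problems.append("forbidden characters: " + " ".join(bad))
    # Assumption: the limit applies to UTF-16 code units, not code points.
    utf16_units = len(name.encode("utf-16-le")) // 2
    if utf16_units > MAX_NAME_LENGTH:
        problems.append("longer than %d characters" % MAX_NAME_LENGTH)
    return problems


print(windows_name_problems("Photos 1:12:09"))  # ['forbidden characters: :']
print(windows_name_problems("MyName.text"))     # []
```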

Case

File systems can be case-preserving, and/or case-sensitive. The difference between these may appear small, but is of great practical importance. Preserving case means that if you save a file with the name MyName.text, it will always be known as MyName.text, not MYNAME.TEXT or any other variation. Case sensitivity means that MyName.text is recognised as being different to MYNAME.TEXT and other variations. Normally, a file system that is case-sensitive also preserves case, although there are some old implementations that did not.

The default OS X file system, Macintosh or Mac OS Extended (HFS+), preserves case but is not sensitive to it. If you create a file named MyName.text in one folder and try to copy another file named MYNAME.TEXT into that folder, HFS+ will consider that the two files have identical names, and warn you that MyName.text already exists, asking whether you want to replace it with MYNAME.TEXT. This occurs even though it always returns the two files’ names in the way that you created them, with their original capitalisation. For the great majority of Mac users, this is sufficient, and case-sensitivity brings no advantages.

In recent versions of OS X, you also have the option of using a case-sensitive version of HFS+, set when you initialise a volume using Disk Utility. Under this, you can put MyName.text and MYNAME.TEXT in the same folder, and the file system will recognise them as different files with different names, every bit as different as YourName.text is.
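If you are unsure which variant a volume uses, one common trick is to create a scratch file and then ask for it under a differently capitalised name. The following is a minimal sketch of that idea; the function name is_case_sensitive is hypothetical, and it assumes that you are allowed to write to the folder being tested.

```python
import os
import tempfile


def is_case_sensitive(folder):
    """Report whether the file system holding `folder` treats case as significant."""
    # Create a scratch file with a lower-case name, then ask for it in upper case.
    with tempfile.NamedTemporaryFile(prefix="casetest", dir=folder) as scratch:
        directory, name = os.path.split(scratch.name)
        # On a case-insensitive volume the upper-case name finds the same file.
        return not os.path.exists(os.path.join(directory, name.upper()))


print(is_case_sensitive(tempfile.gettempdir()))
```

The scratch file is removed automatically when the with block ends, so the probe should leave nothing behind.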

Your Mac can access other file systems which behave differently to HFS+ in either of its current variants. Older implementations of the MS-DOS file system FAT were neither case-sensitive nor case-preserving: MyName.text could not exist, but would have been MYNAME.TXT (noting its primitive requirement for three-character extensions too). Modern Windows file systems such as FAT32, VFAT, and NTFS normally behave like regular HFS+, being case-preserving but not case-sensitive. However, when NTFS is accessed through a POSIX subsystem, it acquires case sensitivity, which has been known to cause chaos.

Case handling

Delving still deeper into this, case handling depends on the two types of file system lookup: directory iteration and individual file lookup. In directory iteration, software identifies the directory to be accessed and then requests a listing of that directory’s contents; individual file lookup passes the name of the file being sought. Case-sensitivity on storage determines whether the file system converts all file names to a common case (case-insensitive, such as old FAT working entirely in capitals), or stores them using the capitalisation originally provided (case-sensitive, such as HFS+).

When a file system iterates through a directory, it can be insensitive to case, returning file names using a fixed convention such as MS-DOS returning all in upper case, or it can return the file name as originally given, preserving case. Individual file lookup can also be case-sensitive, in which case it will only match the name exactly as stored, or it can be insensitive, ignoring that case when finding a match.

Normal HFS+ is case-sensitive on storage, but case-insensitive on lookup. When you create a new file, this involves a lookup operation. If you try to expand a compressed archive that came from a Unix file system that was both case-sensitive and case-preserving, and which contains MyName.text and MYNAME.TEXT, normal HFS+ will create a file with the name of whichever is unarchived first, say MyName.text, and save its contents under that name. However, when the archiving software tries to create MYNAME.TEXT, an error will be returned, reporting that the file MyName.text already exists. If the software presses on and saves the contents of MYNAME.TEXT, they will then appear in the file named MyName.text, which could prove very confusing.
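The unarchiving scenario can be mimicked with a toy model. The class below is purely illustrative and is not how HFS+ is implemented: it stores each name as given but matches names ignoring case, using Python's casefold as a stand-in for the file system's own case-folding rules.

```python
class CasePreservingFolder:
    """Toy folder: stores names as given, but matches them ignoring case."""

    def __init__(self):
        self._entries = {}  # folded name -> (name as stored, contents)

    def create(self, name, contents):
        key = name.casefold()  # the lookup key ignores case
        if key in self._entries:
            stored_name = self._entries[key][0]
            raise FileExistsError(stored_name + " already exists")
        self._entries[key] = (name, contents)

    def write(self, name, contents):
        """Overwrite whichever entry matches `name`, keeping its stored name."""
        key = name.casefold()
        stored_name = self._entries.get(key, (name, None))[0]
        self._entries[key] = (stored_name, contents)

    def listing(self):
        return [stored for stored, _ in self._entries.values()]

    def read(self, name):
        return self._entries[name.casefold()][1]


folder = CasePreservingFolder()
folder.create("MyName.text", "first file")
try:
    folder.create("MYNAME.TEXT", "second file")
except FileExistsError as error:
    print(error)                            # MyName.text already exists
folder.write("MYNAME.TEXT", "second file")  # the software presses on regardless
print(folder.listing())                     # ['MyName.text']
print(folder.read("MyName.text"))           # second file
```

Only one entry survives, under the first name, holding the contents of the second file.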

Switching normal HFS+ to its case-sensitive form (which is case-sensitive on lookup, as well as storage) involves more than just throwing a switch in the file system. All the catalogue indexes need to be rebuilt to alter this behaviour, so this requires the volume to be initialised. Because the great majority of applications and bundled components in OS X assume that they will be running on the normal case-preserving but case-insensitive variant, you are advised not to make a startup volume HFS+ case-sensitive, as problems can result. However if you need case sensitivity, as you will when working with many Linux or Unix software sources, you should initialise a non-startup volume to contain those files.

Forbidden characters

Either variant of HFS+ copes with almost any Unicode character, including Cyrillic, Chinese, and Kanji characters, but has particular issues over colons ‘:’ and slashes ‘/’. These result from Mac OS X’s inheritance from both Classic Mac and Unix operating systems. On Classic Macs, the colon was used to separate volume and folder names when giving the full path to a file, such as MacHD:Documents:MyName, and this lives on in the Finder. Try changing a file or folder name to include a colon, and the Finder will refuse to permit it.

Instead, you can use the slash that is widely used by file systems as the separator between volume and folder names in full paths, such as MacHD/Users/HOakley/Documents/MyName.text – but only in the Finder. At the command line, in Terminal, any slashes that you saw in folder and file names will be shown as colons, which have no special meaning in Unix file paths. So a file named “Photos 1/12/09” in the Finder will be known as “Photos 1:12:09” to Unix tools – something you must bear in mind when writing command scripts, but not pure AppleScript!
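When a script has to translate between the two views of a name, the swap is simple enough to sketch. The two function names below are invented for this example; each handles a single file or folder name, not a full path.

```python
def finder_to_unix(name):
    """Name as shown in the Finder -> name as seen by Unix tools."""
    if ":" in name:
        raise ValueError("the Finder does not allow colons in names")
    return name.replace("/", ":")


def unix_to_finder(name):
    """Name as seen by Unix tools -> name as shown in the Finder."""
    if "/" in name:
        raise ValueError("a single Unix name cannot contain a slash")
    return name.replace(":", "/")


print(finder_to_unix("Photos 1/12/09"))  # Photos 1:12:09
print(unix_to_finder("Photos 1:12:09"))  # Photos 1/12/09
```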

The only other oddity about file and folder naming in OS X is the use of spaces. Space characters are perfectly acceptable in volume, folder and file names, but can still occasionally give rise to strange problems when used in volume naming. You should therefore try to avoid naming any volume with an embedded space, although most of the time this will cause no harm (and is the default on OS X startup volumes).

Spaces can though become a pain when you need to type shell commands in Terminal, because the space character is used to separate elements within a command. If you want to list the contents of the folder named “My Images”, you will need to use a shell convention to ensure that the space is properly recognised. Options normally include:
ls My\ Images (where the backslash ‘escapes’ the next character)
ls 'My Images'
ls "My Images"
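If you build shell commands from a script rather than typing them, the same quoting can be applied for you. This short Python sketch uses the standard shlex module to quote a name and to show how a POSIX shell would split each form of the command into words; the variable names are only for illustration.

```python
import shlex

folder = "My Images"

# shlex.quote wraps the name so that a POSIX shell treats it as one word.
command = "ls " + shlex.quote(folder)
print(command)                         # ls 'My Images'

# shlex.split shows how the shell would break each form into words.
print(shlex.split(r"ls My\ Images"))   # ['ls', 'My Images']
print(shlex.split('ls "My Images"'))   # ['ls', 'My Images']
print(shlex.split("ls My Images"))     # ['ls', 'My', 'Images']
```

The last line shows the problem: without escaping or quoting, ls is asked to list two separate items, My and Images.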

Tools: Bulk Conversion

The two most common situations in which you might want to convert a lot of files from one naming convention to another are when importing files that contain colons (or slashes, depending on whether you see them in the Finder or Terminal), and exporting from HFS+ to a file system that is more restrictive in the characters that can be used in names, such as FAT32.

Although you can use the Finder’s Find command in its File menu to look for file and folder names containing forbidden characters, discovering more than a handful leaves you with the tedious and error-prone task of manually renaming each. If you have a few hundred left over from an earlier naming convention that, say, embedded dates as 24/09/95 into every name, then you need an automatic tool.
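For those comfortable with scripting, a batch rename of this kind might look something like the sketch below. It assumes that the dates appear to Unix tools with colons (24:09:95) and that hyphens are an acceptable replacement; the function names and the date pattern are illustrative choices, not a recommendation of any particular tool. Planning the renames separately from applying them lets you inspect the changes first.

```python
import os
import re

# Dates such as 24:09:95, as Unix tools see names the Finder shows as 24/09/95.
DATE_PATTERN = re.compile(r"(\d{1,2}):(\d{1,2}):(\d{2,4})")


def plan_renames(names):
    """Map each name containing a colon-separated date to a hyphenated form."""
    plan = {}
    for name in names:
        new_name = DATE_PATTERN.sub(r"\1-\2-\3", name)
        if new_name != name:
            plan[name] = new_name
    return plan


def apply_renames(folder, plan):
    """Rename files in `folder`, skipping any whose new name is already taken."""
    for old, new in plan.items():
        target = os.path.join(folder, new)
        if os.path.exists(target):
            print("skipped, name in use:", new)
            continue
        os.rename(os.path.join(folder, old), target)


print(plan_renames(["Minutes 24:09:95.doc", "Budget.xls"]))
# {'Minutes 24:09:95.doc': 'Minutes 24-09-95.doc'}
```

In practice you would pass os.listdir(folder) to plan_renames, check the result, and only then call apply_renames.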

Tools to rename batches of files are detailed here. Those who fancy their command line skills can of course use regular expressions, which are detailed here.

Techniques: Going Case-Sensitive

Some Unix and Linux tools have source code files that require case-sensitive handling, as they include two or more files whose names would appear the same to normal case-insensitive HFS+. If you have a volume already running the case-sensitive variant of HFS+, this will present no problem. If you do not, then you will need to create one.

Unless you will be working regularly with such problem source code files, it is probably simplest to create a case-sensitive disk image, unarchive the source code files into that disk image, compile them there, and (assuming that the compiled products do not require case sensitivity) install them on your normal HFS+ volume.
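Before deciding whether a disk image is needed at all, you could check whether an archive actually contains names that differ only in case. The sketch below is one way to do that with Python's standard tarfile module; the function names are hypothetical, and casefold is only an approximation of the comparison that HFS+ performs.

```python
import tarfile
from collections import defaultdict


def case_collisions(names):
    """Group paths that differ only in capitalisation."""
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    return [group for group in groups.values() if len(group) > 1]


def archive_collisions(archive_path):
    """List case-only name clashes among the members of a tar archive."""
    with tarfile.open(archive_path) as archive:
        return case_collisions(archive.getnames())


print(case_collisions(["src/MyName.text", "src/MYNAME.TEXT", "src/main.c"]))
# [['src/MyName.text', 'src/MYNAME.TEXT']]
```

An empty list from archive_collisions suggests that the archive should unpack safely on a normal case-insensitive volume.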

Using Disk Utility, choose its Blank Disk Image… command from the New item in the File menu. Set the Format popup to read “Mac OS Extended (Case-sensitive)” and the other entries in the New Blank Image as appropriate to your requirement. It is usually best to make the image single partition, with a GUID partition map. When you click on the Create button, the desired disk image will be created and mounted ready for use.
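The same image can probably be created from a script using the hdiutil command-line tool that ships with OS X. The sketch below only assembles the command; the flags shown (-size, -fs, -layout, -volname) are given as I understand them from the hdiutil manual, so check man hdiutil on your own system before relying on them. The function names, default size, and volume name are assumptions made for this example.

```python
import subprocess


def disk_image_command(path, size="1g", volume_name="CaseSensitive"):
    """Build an hdiutil command for a case-sensitive HFS+ disk image."""
    return [
        "hdiutil", "create",
        "-size", size,                  # for example 500m or 1g
        "-fs", "Case-sensitive HFS+",   # counterpart of the Format popup choice
        "-layout", "GPTSPUD",           # single partition with a GUID map
        "-volname", volume_name,
        path,
    ]


def create_disk_image(path, **options):
    """Run the command; this only works on a Mac, where hdiutil is available."""
    subprocess.run(disk_image_command(path, **options), check=True)


print(disk_image_command("build.dmg"))
```

Unlike Disk Utility, this may leave the new image unmounted, in which case you would need to open it before use.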

To build a Unix tool, copy the compressed source archive (typically Tarred up and then GnuZipped, .tgz or .tar.gz) to your case-sensitive disk image and decompress it there. Follow the configure or build instructions to compile, link, and finally install the tool, ensuring that it is installed back in the appropriate folder on your startup volume. Tools that need to be installed on a case-sensitive file system are not unheard of, but you are very unlikely to come across one, although quite a few contain potential name conflicts in their source code.

Summary

  • Different file systems, such as Mac Extended (HFS+), FAT, and NTFS, have different rules to determine permissible folder and file names.
  • When moving files to and from different file systems, you may need to rename them so that they conform to the rules.
  • Normal HFS+ is case-preserving but not case-sensitive. When initialising a volume, you can opt for it to be case-sensitive instead.
  • By default you should still initialise Mac volumes to HFS+ without case sensitivity, especially those used to start your Mac up.
  • If you need case-sensitivity, for example when building Unix or Linux tools, use a working disk image initialised to case-sensitive HFS+.
  • The best tool for renaming large batches of files, for migration, is A Better Finder Rename (Better Rename in the App Store), although the Finder is now much more helpful for this.
  • The Finder does not accept colons in file names, whilst Terminal does not accept slashes; slashes seen in Finder names become colons in Terminal.
  • Where possible, avoid embedding spaces in volume names.

Updated from the original, which was first published in MacUser volume 25 issue 24, 2008.