So far in this series, I have concentrated on how pressing keys enters text, and how you can customise and extend that. We also use the keyboard for other purposes, best summarised as commands. These include startup key combinations (Intel-only), global commands like Command-Option-Escape for the Force Quit dialog, menu shortcuts within apps, and others. To understand how they work, I’ll explain how a keypress is handled in macOS, using the following diagram.
Pressing any key on a connected keyboard sends a code to the keyboard device driver in the Mac, as a low-level event. This is passed first to the kernel and its extensions, where certain key combinations are detected and actioned, such as those for the reserved Power and Eject keys. There’s also an opportunity for background processes to acquire low-level events at this stage, through event taps.
Normally, all other low-level key events are then passed to WindowServer. Because that composites windows together to form what’s displayed, it’s the only process that knows which window and app is active and at the front. WindowServer adds a timestamp to the event, and translates it into the Unicode equivalent for that key, before dispatching the full keyboard event to the process that needs to handle it, generally the application whose window is currently at the front and active.
When an app processes a keyboard event passed to it by WindowServer, the first thing it normally looks for is whether there’s a key equivalent. These normally consist of the Command modifier and one or more other keys, such as Command-C for copy, and perform an action defined by the app’s menus. Apple’s guidance is stricter: key equivalents should consist of only the Command modifier, a single character, and optionally the Shift modifier.
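To make that guidance concrete, here’s a minimal sketch in Python that checks whether a shortcut fits the pattern of Command, a single character, and optionally Shift. The KeyCombo type, the modifier names and the follows_guidance function are all hypothetical, invented for illustration; they aren’t part of any Apple API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyCombo:
    """Hypothetical description of a shortcut: modifier names plus keys pressed."""
    modifiers: frozenset
    keys: tuple


def follows_guidance(combo: KeyCombo) -> bool:
    """True if the combo is Command + one character, optionally with Shift."""
    allowed = {"command", "shift"}
    return (
        "command" in combo.modifiers       # Command is required
        and combo.modifiers <= allowed     # no modifiers other than Command and Shift
        and len(combo.keys) == 1           # exactly one other key
        and len(combo.keys[0]) == 1        # and that key is a single character
    )


copy = KeyCombo(frozenset({"command"}), ("c",))
save_as = KeyCombo(frozenset({"command", "shift"}), ("s",))
with_option = KeyCombo(frozenset({"command", "option"}), ("s",))
no_command = KeyCombo(frozenset({"shift"}), ("s",))
```

In this model, copy and save_as pass the check, while with_option and no_command don’t. Plenty of real apps use shortcuts that go beyond this pattern, so treat it as a reading of the guidance rather than a rule that macOS enforces.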
Next, the app should detect whether the event is one of the keyboard interface controls, such as the Tab key to make the next interface element active. These normally include Tab, Shift-Tab, Control-Tab, Control-Shift-Tab, Space, the arrow keys, Option and Shift.
If that doesn’t apply, the key event is then sent to the active window. That checks whether the event is a keyboard action, such as Page Down. If it is, then the key action is sent to what’s termed the first responder, and so on through the responder chain. If there’s no keyboard action, it’s presumed the key event is to be treated as text entry of that Unicode character, which is then inserted in the text currently being edited.
Responders are normally user-interface objects that handle events. The first responder is the first such object in a chain. For instance, when you’ve selected an editable text box, that becomes first responder, and is normally sent keyboard events for it to handle. If the first responder doesn’t handle a keyboard event, such as an editable text box that has no use for that key, the event is passed up the responder chain to the next view up in the view hierarchy, which acts as the next responder.
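The way an unhandled event climbs the responder chain can be sketched as a linked list of objects, each of which either handles a key or passes it on. This is a toy model in Python, not AppKit: the Responder class, its key_down method and the view names are assumptions made for illustration.

```python
class Responder:
    """Toy stand-in for a user-interface object that can handle key events."""

    def __init__(self, name, handled_keys=(), next_responder=None):
        self.name = name
        self.handled_keys = set(handled_keys)
        self.next_responder = next_responder  # the next view up in the hierarchy

    def key_down(self, key):
        """Handle the key here, or pass it up the chain.

        Returns the name of the responder that handled it, or None.
        """
        if key in self.handled_keys:
            return self.name
        if self.next_responder is not None:
            return self.next_responder.key_down(key)
        return None  # nothing in the chain handled the event


# Build a chain from the top down: window <- scroll view <- text box.
window = Responder("window", handled_keys={"PageDown"})
scroll_view = Responder("scroll view", next_responder=window)
text_box = Responder("text box", handled_keys={"a"}, next_responder=scroll_view)

# The selected text box is the first responder.
first_responder = text_box
```

Here, first_responder.key_down("a") is handled by the text box itself, while first_responder.key_down("PageDown") passes through the scroll view and is handled by the window. A key that nothing in the chain recognises returns None.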
In the normal run of things, the order in which keyboard events are handled is:
- reserved keys, such as Power, Eject
- event taps, received and processed by background processes
- WindowServer translation and dispatch
- app key equivalents, including menu shortcuts
- app keyboard interface controls, such as Tab between controls
- app keyboard actions, such as page down
- app text insertion.
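That order can be modelled as a list of stages, each given the chance to claim an event before it falls through to the next. The following Python sketch is a deliberate simplification under several assumptions: the event dictionaries, stage names and key sets are hypothetical, event taps are treated as always consuming the events they receive, and WindowServer is left out because it translates and dispatches events rather than claiming them.

```python
def handle_key_event(event, stages):
    """Offer the event to each stage in order; return the first stage that claims it."""
    for name, claims in stages:
        if claims(event):
            return name
    return None


# Hypothetical key sets, chosen only to illustrate each stage.
RESERVED = {"Power", "Eject"}
INTERFACE_CONTROLS = {"Tab", "Space"}
KEY_ACTIONS = {"PageDown"}

# Stages in the order given above; text insertion is the final fallback.
STAGES = [
    ("reserved key", lambda e: e["key"] in RESERVED),
    ("event tap", lambda e: e.get("tapped", False)),
    ("key equivalent", lambda e: "command" in e.get("modifiers", ())),
    ("interface control", lambda e: e["key"] in INTERFACE_CONTROLS),
    ("keyboard action", lambda e: e["key"] in KEY_ACTIONS),
    ("text insertion", lambda e: True),
]
```

For example, handle_key_event({"key": "c", "modifiers": ("command",)}, STAGES) stops at the key equivalent stage, while a plain {"key": "a"} falls all the way through to text insertion.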
The handling chain offers developers different opportunities to modify behaviours. Among these are a system extension (formerly a kernel extension) to modify the keyboard device driver, which can even remap reserved keys, and gathering events through event taps in a background process. However, once a key event has been dispatched by WindowServer to an app, there’s little that third-party software can do unless it uses an interface exposed by that app.
Finally, WindowServer’s central role in this explains why, when it’s in trouble, keyboard input may become erratic or fail altogether.
Further details are provided in Apple’s Cocoa Event Handling Guide. Although that’s the current version, it hasn’t been updated since 2016.