Exploring natural languages with Nalaprop

I recently updated my free linguistic analysis utility Nalaprop. This article is an introduction to its features, which use the advanced analysis functions built into more recent versions of macOS, specifically from Mojave onwards. If you’re in the least bit interested in language, or want to analyse any writing, then Nalaprop has features which are otherwise only available in expensive and specialised products, all free, courtesy of macOS.

Mojave and later include sophisticated functions which analyse natural languages in several different ways. To start with, I’ll explain these using the multi-language sample file which is included with Nalaprop: nalapropMultiTest1.txt. Nalaprop opens plain text files directly, or you can paste or type text for analysis into its left view. To start this demonstration, simply open that file using the Open command.


When it opens a text file, the contents are automatically analysed and displayed in two of its views. At the left is the original text, and in the middle is the same text with each word coloured according to its part of speech, such as noun and verb. The key to the colours used is at the right.

Scroll down through the centre view and you’ll notice that this file consists of multiple languages, a total of 20. Some of them are shown in multiple colours, indicating that macOS has been able to recognise that language and parse it into parts of speech. In Catalina 10.15.6, this is supported for the following eight languages: English, French, Spanish, German, Italian, Portuguese, Russian, and Turkish (I suspect support for the last is limited).

Apple normally adds support for languages other than English with the first full release of each major version of macOS. If you’re running Nalaprop on a Big Sur beta, you’re therefore only likely to see English parsed for the time being, but that should change with the release of 11.0.

To perform a wider range of analyses, click on the MultiParse button at the top of the window. After a short delay during which macOS analyses the text in more detail, the popup menu at the top of the middle view is enabled, and the colours of the text in that view change, as does the colour key at the right.

To see the sample text analysed for its script (alphabet), select Script in that popup menu. This displays each script in a different colour, including Cyrillic (used here for Russian), Arabic, and more. Even those languages which macOS is unable to parse into parts of speech are recognised and labelled.
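If you’re curious what labelling text by script involves, here’s a minimal Python sketch. It isn’t how Nalaprop or macOS does it: it simply assumes that the first word of each letter’s Unicode name (LATIN, CYRILLIC, ARABIC and so on) is a good enough guess at its script, and the function names script_of and scripts_in are my own inventions for illustration.

```python
import unicodedata
from collections import Counter


def script_of(char):
    """Guess a letter's script from the first word of its Unicode name.

    Assumption for this sketch only: names such as 'CYRILLIC SMALL LETTER PE'
    begin with the script. Non-letters return None.
    """
    if not char.isalpha():
        return None  # skip digits, punctuation and whitespace
    name = unicodedata.name(char, "")
    return name.split(" ")[0] if name else None


def scripts_in(text):
    """Count how many letters of each guessed script appear in the text."""
    return Counter(s for s in map(script_of, text) if s is not None)


sample = "Hello Привет مرحبا"
for script, count in scripts_in(sample).items():
    print(script, count)
```

This works character by character, so it can only ever tell you about the alphabet in use. Recognising the language itself, as the Language option does, is a much harder problem.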


Repeat this by switching the popup menu to Language, and you’ll see that macOS is excellent at recognising almost all of the languages, although some of the East Asian languages and Esperanto at the end of the samples make it struggle.

There are two entries in the popup menu for Types: the upper of the two uses colours different from the default, and the lower is the same as when you first opened the file.

The last of the options in the popup menu shows different text from the original at the left: these are Lemmas, word roots without grammatical alteration. Instead of the verb was, it displays the verb root be, and so on. This can be very helpful if you’re looking at the frequency of words, or trying to translate them, for instance.

Nalaprop also performs word frequency analysis, and can do that either on the words used or on their lemmas. To test this out, you’ll need a fairly long plain text document: here I’ll use a copy of Charles Dickens’ novel A Christmas Carol. Open that using the Open command, which should automatically parse it into parts of speech. Then click on the List button at the top right of the window.


All the words found in the document are then summarised by part of speech, starting with verbs. For each word found, the number of occurrences is given in parentheses. Because this analysis is performed on unmodified words, you’ll see words from the same lemma, like afford and afforded, counted separately.

Nalaprop can be smarter than that: click on the MultiParse button, and wait while macOS works through the whole document again; this can take a few moments, and you may see the spinning beachball briefly, which indicates it’s busy and not hung. When that completes, click on the List button again.


The contents of the view at the right now show the frequency not of individual inflected words, but of their roots. In English, for example, you shouldn’t see the words am or was listed, as they’re now counted under the lemma be.
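The difference between counting words and counting lemmas can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Nalaprop’s code: the LEMMAS table is a tiny hand-made stand-in for the lemmatiser in macOS, the tokeniser is deliberately crude, and all the names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, hand-made lemma table standing in for a real lemmatiser.
LEMMAS = {
    "am": "be", "is": "be", "was": "be", "were": "be",
    "afforded": "afford", "affords": "afford",
}


def words(text):
    """Crude tokeniser: lower-case runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text):
    """Count each inflected word exactly as it appears."""
    return Counter(words(text))


def lemma_frequencies(text):
    """Count words after mapping each to its lemma, where one is known."""
    return Counter(LEMMAS.get(w, w) for w in words(text))


sample = "I am what I was. He could afford it; she afforded it too."
print(word_frequencies(sample)["afford"], word_frequencies(sample)["afforded"])
print(lemma_frequencies(sample)["afford"], lemma_frequencies(sample)["be"])
```

In the first count, afford and afforded are tallied separately, as are am and was; in the second they are merged under afford and be. A real lemmatiser also has to work out each word’s part of speech before it can choose the right root, which is why the second pass in Nalaprop takes longer.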

You can save the word frequency list (but neither of the other views) by clicking on the Save button.

Nalaprop has an extensive Help book available, also provided as a separate PDF, and two colour keys which help you see the colour palettes in detail. You can use its Find command to turn it into an interactive concordance, based on the word frequency list.

I hope you find it useful.

3 Comments


1

Enrico Scarpella on September 7, 2020 at 5:39 pm

Dear Mr. Oakley,

Thank you very much for creating this invaluable tool and for generously making it available to everyone. I use it to analyze and improve my manuscripts, so thank you very much for that.

I hope you will not find me ungrateful if I make a suggestion. I find it very difficult, if not impossible, to distinguish the pink and red used in the app. Furthermore, sometimes I would only like to visualize the respective positions of two parts of speech, not all of them. I thought both could be addressed if the user were allowed to modify the colours used for the different parts of speech: I could change the pink to magenta, for example, and if I wished to visualize only two parts of speech, I could change the colours for all the other parts to black (I don’t use dark mode).

Of course, I have no idea if what I propose is even possible and how difficult it would be to implement it. In any case, it would certainly take time and energy, which I am aware are always limiting. That’s why I am particularly grateful for Nalaprop and would like to thank you very much again for selflessly making it available to everyone. Thank you for your consideration.

Best regards,
Enrico

- 2
  
  hoakley on September 7, 2020 at 5:48 pm
  
  Thank you.
  Yes, it’s something on my list of features to add. At present I’m still working hard to ensure that all my utilities are Universal Apps and run on Big Sur. I also never know how many people use any of them: now that I know you’d like that I’ll try to get round to adding it when I can.
  Howard.
  
  - 3
    
    Enrico Scarpella on September 7, 2020 at 9:57 pm
    
    Thank you very much, Howard, for the prompt and kind reply and for taking my suggestion into consideration.
    
    Best regards,
    Enrico
    
