Interested in words and language? Nalaprop 1.1 might help

I’m delighted to release a new version of my free text analysis utility Nalaprop. The main change in version 1.1 is that you can now increase and decrease text size in all three of its panels together. Nalaprop analyses text into different scripts and languages, and for those supported shows parts of speech. It also compiles a frequency list of words used according to their part of speech, which can be valuable even when working in a single language.

Mojave and later versions of macOS include sophisticated features which analyse natural languages in several different ways. To start with, I’ll explain these using the multi-language sample file which is included with Nalaprop: nalapropMultiTest1.txt. Nalaprop opens plain text files directly and extracts the text content of Rich Text files, or you can paste or type text for analysis into its left panel. To start this demonstration, simply open that sample file using the Open command.

nalaprop01

When Nalaprop opens a text file, its contents are automatically analysed and displayed in two of the panels. At the left is the original text, and in the middle is the same text with each word coloured according to its part of speech, such as noun and verb. The key to the colours used is at the right.

Scroll down through the centre panel and you’ll notice that this file consists of multiple languages, a total of 20. Some of them are shown in multiple colours, indicating that macOS has been able to recognise that language and parse it into parts of speech. There’s support for this in at least eight languages: English, French, Spanish, German, Italian, Portuguese, Russian, and Turkish. Language support may require that your Mac downloads additional data for macOS before that language can be parsed into parts of speech. One way to do this is to add the language to the list of Preferred languages in the General tab of the Language & Region pane.

To perform a wider range of analyses, click on the MultiParse button at the top of the window. After a short delay during which macOS analyses the text in more detail, the popup menu at the top of the middle panel is enabled, and the colours of the text in that panel change, as does the colour key at the right.

To see the sample text analysed for its script (alphabet), select Script in that popup menu. This displays each script in a different colour, including Latin, Cyrillic, Arabic, and more. Even those languages which macOS is unable to parse into parts of speech are recognised and labelled.
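To give a rough feel for what labelling text by script involves, here is a minimal Python sketch. It is not how Nalaprop or macOS does it: it simply assumes that the first word of a character's Unicode name (LATIN, CYRILLIC, ARABIC and so on) is a good enough stand-in for its script, and the function names script_of and label_words are made up for this illustration.

```python
import unicodedata

def script_of(ch: str) -> str:
    """Approximate a character's script from the first word of its Unicode name.

    Assumption for illustration only: real script detection uses Unicode
    script properties, not character names.
    """
    if not ch.isalpha():
        return "Other"
    name = unicodedata.name(ch, "")
    return name.split(" ")[0].capitalize() if name else "Unknown"

def label_words(text: str) -> list[tuple[str, str]]:
    """Label each whitespace-separated word with the script of its first character."""
    return [(word, script_of(word[0])) for word in text.split()]

# Each word is paired with a script label, much as the Script view
# pairs each word with a colour.
for word, script in label_words("Hello Привет مرحبا 2022"):
    print(f"{word}: {script}")
```

A word that starts with a digit or punctuation mark is labelled Other here, which is one of several corners this sketch cuts.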

nalaprop02

Repeat this by switching the popup menu to Language, and you’ll see that macOS is excellent at recognising almost all of the languages, although some of the East Asian languages and Esperanto at the end of the samples make it struggle.

There are two entries in the popup menu for Types: the upper of the two uses colours different from the default, and the lower is the same as when you first opened the file.

The last of the options in the popup menu shows different text from the original, at the left: these are Lemmas, word roots without grammatical alteration (inflection). Instead of the verb was, it displays the verb root be, and so on. This can be very helpful if you’re looking at the frequency of words, or trying to translate them, for instance.

Nalaprop also performs word frequency analysis, and can do that either on the words used or on their lemmas. To test this out, you’ll need a fairly long plain text document: here I’ll use a copy of Charles Dickens’ novel A Christmas Carol. Open that using the Open command, which should automatically parse it into parts of speech. Then click on the List button at the top right of the window.

nalaprop03

All the words found in the document are then summarised by part of speech, starting with verbs. For each word found, the number of occurrences is given in parentheses. Because this analysis is performed on unmodified words, you’ll see words from the same lemma, like afford and afforded, counted separately.
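The idea behind such a list can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the (word, part of speech) pairs are labelled by hand here, where Nalaprop gets them from macOS, and the function names and the alphabetical ordering are my own choices rather than anything taken from the app.

```python
from collections import Counter, defaultdict

def frequency_by_class(tagged):
    """Count each word separately within its part of speech."""
    groups = defaultdict(Counter)
    for word, part_of_speech in tagged:
        groups[part_of_speech][word.lower()] += 1
    return groups

def format_list(groups):
    """Render the counts as text, with occurrences in parentheses."""
    lines = []
    for part_of_speech in sorted(groups):
        lines.append(part_of_speech)
        for word in sorted(groups[part_of_speech]):
            lines.append(f"  {word} ({groups[part_of_speech][word]})")
    return "\n".join(lines)

# Hand-labelled sample standing in for the tags macOS would supply.
tagged = [
    ("afford", "Verb"),
    ("afforded", "Verb"),
    ("afford", "Verb"),
    ("ghost", "Noun"),
]
print(format_list(frequency_by_class(tagged)))
```

Because the counting key is the word exactly as written (lower-cased), afford and afforded end up as separate entries, just as in the unmodified word list.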

Nalaprop can be smarter than that: click on the MultiParse button, and wait while macOS works through the whole document again; this can take a few moments, and you may see the spinning beachball briefly, which indicates it’s busy and not hung. When that completes, click on the List button again.

nalaprop04

The contents of the view at the right now show the frequency not of individual inflected words, but of their roots. In English, for example, you shouldn’t see the words am or was listed, because they’re now counted under the lemmatised form be.
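The difference that lemmas make to a frequency count can be shown with a small Python sketch. The lemma table below is a tiny hand-written stand-in, purely hypothetical, for the lemmas that macOS works out for itself; the name lemma_counts is likewise invented for this example.

```python
from collections import Counter

# Hypothetical, hand-written lemma table for illustration only.
LEMMAS = {"am": "be", "is": "be", "was": "be", "afforded": "afford"}

def lemma_counts(words, lemmas=LEMMAS):
    """Count words by lemma, falling back to the lower-cased word itself."""
    return Counter(lemmas.get(word.lower(), word.lower()) for word in words)

words = ["I", "am", "what", "I", "was", "afford", "afforded"]
counts = lemma_counts(words)

# Inflected forms are merged under their root before counting.
for lemma in sorted(counts):
    print(f"{lemma} ({counts[lemma]})")
```

Here am and was both contribute to be, and afforded is folded into afford, so neither am, was nor afforded appears as an entry of its own.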

You can save the word frequency list (but neither of the other views) by clicking on the Save button.

Nalaprop has an extensive Help book available, also provided as a separate PDF, and two colour keys which help you see the colour palettes in detail. You can use its Find command to turn it into an interactive concordance, based on the word frequency list.

In case you’re wondering, Nalaprop correctly recognises there as a pronoun, their as a determiner, and they’re as a pronoun and verb, and even gives their lemmas respectively as there, they and they-be. Your Mac really is smarter than its spell-checker.

Nalaprop version 1.1 is available from here: nalaprop11, from Downloads above, from its Product Page, and through its auto-update mechanism.

Enjoy!

Thanks to Larry for twisting my arm to fix this.

12 Comments


1

joffday on May 5, 2022 at 8:55 am

Brilliant! Thanks for all the hard work on this.

- 2
  
  hoakley on May 5, 2022 at 5:41 pm
  
  Thank you.
  Howard.
  
3

EcleX on May 5, 2022 at 10:14 am

Many thanks!

4

artiste212 on May 5, 2022 at 5:43 pm

What a wonderful app to play with! Thank you, Howard.

I was suitably impressed that the OS is also able to distinguish when the same word is being used as a noun or a verb. I’m curious, however, as to why the second panel can’t be saved and can’t be copied in color — something to do with how MacOS analyzes these texts? I am, of course, able to convert a screenshot into a full color, OCR’ed PDF, so I have no pressing need for this functionality. Just curiosity.

- 5
  
  hoakley on May 5, 2022 at 5:50 pm
  
  Thank you.
  I’ve just realised that you can save the middle panel if you’re careful. Ensure the cursor is in the text there, e.g. by selecting a word within it. Then use Save As… and select Rich Text as the file format.
  Sometimes my code works better than I had realised!
  Natural language support in macOS is really excellent – one of the lesser-known strengths. I’m surprised that so few apps seem to use it.
  Howard.
  
6

Christian on May 6, 2022 at 9:53 am

Oh I wish I had this app when I wrote a book fifteen years ago! At that time, I had to make numerous stats in Excel documents… ;-)
Thanks.

- 7
  
  hoakley on May 6, 2022 at 11:25 am
  
  Thank you. Sorry to be so late!
  Howard.
  
8

Myob on May 6, 2022 at 11:43 pm

Thank you, Howard, yet again!

Hope things are going well for you. Best.

9

Graham Keith Rogers on May 8, 2022 at 4:29 am

Really useful. Like Christian I intend to use this for a book; but also for analysis of student texts (2nd language English users). One question, is it possible to order the List output so that high use is shown at the top?

- 10
  
  hoakley on May 8, 2022 at 10:57 am
  
  Thank you.
  Yes it is, although I’m not sure at the moment how easy or difficult that would be to code. I’ll take a look later today and get back with an answer.
  Howard.
  
- 11
  
  hoakley on May 8, 2022 at 10:32 pm
  
  I looked at my source code, and it’s not trivial, but neither is it too onerous. I’ll see if I can fit that in later this week.
  Howard.
  
  - 12
    
    Graham Keith Rogers on May 8, 2022 at 11:29 pm
    
    That’s wonderful to hear. The fact that you have taken the time to look is encouraging.
    