How QuickLook Preview doesn’t tell Apple about images

Since I published my detailed explanation of why there are reports that macOS might appear to send Apple details of images you browse in the Finder, rumour has continued unabated. Most recently, while agreeing with my conclusions, @mysk has claimed this was the result of a bug that has been fixed in macOS 13.2. This article looks at just what does happen in Live Text, and why it contacts Apple.

What do we know?

The story so far is of a single claim that, when browsing images in Ventura’s Finder, mediaanalysisd tried to make an outgoing connection to an Apple server, as revealed by the software firewall Little Snitch. After replicating that using QuickLook Preview, I demonstrated that the outgoing connection was made during image analysis for Live Text. Thus the facts are:

  • Simply browsing images in the Finder doesn’t elicit any outgoing network connections.
  • However, opening an image in QuickLook Preview (pressing the Spacebar) can instigate an outgoing network connection.
  • Opening an image in QuickLook Preview can trigger the OCR process in Live Text, but not Visual Look Up.
  • Live Text doesn’t compute any neural hashes for an image, but analyses the image for possible text content, which is different.
  • Any outgoing network connection during Live Text OCR therefore cannot send Apple any identifiers that could be used to check an image for CSAM or any other content.
  • Thus, the claim that browsing images in the Finder could be used to check for CSAM has no factual support.

This is fully consistent with Apple’s most recent statement on checking images for CSAM, as reported late last year in Wired, but carefully omitted by those spreading rumours. Specifically, Apple is there reported as stating:
“We have further decided to not move forward with our previously proposed CSAM detection tool for iCloud Photos. Children can be protected without companies combing through personal data, and we will continue working with governments, child advocates, and other companies to help protect young people, preserve their right to privacy, and make the internet a safer place for children and for us all.”

Despite this, some still claim that the outgoing connection during Live Text OCR somehow provides information about an image to Apple. This article therefore analyses logs obtained from this event immediately after updating a virtual machine to macOS 13.2.

Image analysis

When you press the Spacebar, QuickLook recognises that you want to display a preview in a floating window, which starts this process.
7.244339 com.apple.quicklook previewView:<private> didShowPreviewItem:<private>

Because this is a full preview rather than a Finder thumbnail, Live Text is supported, but not Visual Look Up. VisionKit gets to work starting image analysis.
7.776088 com.apple.VisionKit BEGIN "VKImageAnalyzerProcessRequestEvent"
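
VisionKit’s public ImageAnalyzer API (macOS 13 and later) is the app-facing counterpart of this machinery. As a rough sketch only, assuming the documented behaviour of that API rather than anything QuickLook itself does, asking for text analysis alone looks like this:
import AppKit
import ImageIO
import VisionKit

// A minimal sketch using the public ImageAnalyzer API (macOS 13+).
// The image URL is supplied by the caller; error handling is kept minimal.
@MainActor
func recogniseText(in url: URL) async throws -> String? {
    guard ImageAnalyzer.isSupported,
          let nsImage = NSImage(contentsOf: url),
          let cgImage = nsImage.cgImage(forProposedRect: nil, context: nil, hints: nil)
    else { return nil }

    // Ask for Live Text only, not Visual Look Up.
    let configuration = ImageAnalyzer.Configuration([.text])
    let analyzer = ImageAnalyzer()
    let analysis = try await analyzer.analyze(cgImage, orientation: .up,
                                              configuration: configuration)
    // The transcript is the recognised text, if any.
    return analysis.transcript
}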

For this, mediaanalysisd is spawned, set up for XPC access, and passed the request for image analysis.
7.801901 com.apple.mediaanalysisd internal event: WILL_SPAWN, code = 0
7.805015 com.apple.mediaanalysisd Successfully spawned mediaanalysisd[1605] because ipc (mach)
7.824079 mediaanalysisd Received on-demand image processing request (CVPixelBuffer) with MADRequestID 1

Because this is being run in real-time, it’s given a high QoS.
7.827881 mediaanalysisd Run <private> (1) [QoS: 25 Cost: 1.000]; remaining resource: 0.00
7.828544 mediaanalysisd VCPMADVIDocumentRecognitionTask running...
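
The QoS value of 25 shown there appears to correspond to the user-initiated class. A quick check in Swift (this only assumes libdispatch’s public constants, nothing specific to mediaanalysisd) confirms the raw value:
import Dispatch

// QOS_CLASS_USER_INITIATED has the raw value 0x19, i.e. 25,
// which appears to match the "QoS: 25" in the log entry above.
print(QOS_CLASS_USER_INITIATED.rawValue)  // prints 25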

CoreML and Espresso then get going with the analysis using neural networks. A small sample of many similar log entries is given here.
7.838437 com.apple.espresso Creating context 5043897904 eng=5 dev=-3
7.838482 com.apple.espresso Creating plan 5033208320
7.848279 com.apple.espresso espresso_plan_add_network plan=5033208320 path=<private> cp=65552 Completed
7.851370 com.apple.coreml <private> class has successfully loaded the model at <private>.
8.334736 com.apple.espresso [change_input_shapes] index=0/1 name=<private> w=200 h=32 k=1 n=4 seq=1
8.395860 com.apple.espresso espresso_plan_add_network plan=5033345536 path=<private> cp=65552 Completed

Note the time gap of nearly 0.5 seconds, during which image analysis is taking place using neural networks.
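
Those private models aren’t accessible directly, but the Vision framework offers the same on-device text recognition to any app. Here’s a minimal sketch, assuming an image URL supplied by the caller, using the two recognition languages that appear later in these logs:
import Vision

// A sketch of on-device OCR using the public Vision API.
// The image URL is supplied by the caller; all processing stays on the Mac.
func recogniseText(at imageURL: URL) throws -> [String] {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate
    request.recognitionLanguages = ["en", "ja"]   // the locales queried later in these logs
    request.usesLanguageCorrection = true         // engages the language-model stage

    let handler = VNImageRequestHandler(url: imageURL, options: [:])
    try handler.perform([request])

    // Take the best candidate string from each detected text region.
    return (request.results ?? []).compactMap { $0.topCandidates(1).first?.string }
}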

Language modelling

This doesn’t generate any neural hashes, but if it’s successful in discovering what may be text within the image, LanguageModeling is initiated to decipher that text. Again, this is a small sample of many similar entries.
8.396966 com.apple.LanguageModeling Options is updating <private> from 1 to 0
8.397006 com.apple.LanguageModeling Creating CompositeLanguageModel (<private>):
<private>
8.398759 com.apple.LanguageModeling NgramModel: Loaded language model: <private>
8.399817 com.apple.Lexicon Lexicon <private>

Linguistic data

It’s at this early stage that macOS checks the currency of local linguistic data to use in text recognition. Significantly, the client is given as mediaanalysisd, which may well be identified as the parent process by a software firewall.
8.403145 com.apple.mobileassetd Creating client/daemon connection: 2A0DE96A-A3AD-4A5B-BD0A-4D3B99BC8F91
8.403240 com.apple.mobileassetd -[ControlManager handleClientConnection:on:]_block_invoke: assetType: com.apple.MobileAsset.LinguisticData client: mediaanalysisd, command: 1 (MA_QUERY_ASSET_TYPE)

This relies on a catalogue kept in /System/Library/AssetsV2/com_apple_MobileAsset_LinguisticData/com_apple_MobileAsset_LinguisticData.xml, with numerous .asset files in sub-folders. If you care to look there, you will see the large number of such files. On one of my Macs there are more than 90 folders, each containing a further hierarchy of language-specific files.
8.403331 com.apple.mobileassetd -[ControlManager determineAssets:clientName:connection:downloadingTasks:message:resultTypes:queryArray:isForSpecificAsset:specificAssetId:specificAllowedDifferences:]: mediaanalysisd queried for: com.apple.MobileAsset.LinguisticData with returnType of: 2 with Purpose: (null)
8.403415 com.apple.mobileassetd -[ControlManager newCatalogLoad:withPurpose:]: Catalog fileLocation: /System/Library/AssetsV2/com_apple_MobileAsset_LinguisticData/com_apple_MobileAsset_LinguisticData.xml
8.454579 com.apple.mobileassetd dataFillInstalledWithPurpose: Path to asset dir: /System/Library/AssetsV2/com_apple_MobileAsset_LinguisticData/495dbc280e6494f8635c3f2e7797e0ccf53546d3.asset
8.454627 com.apple.mobileassetd -[ControlManager determineAssets:clientName:connection:downloadingTasks:message:resultTypes:queryArray:isForSpecificAsset:specificAssetId:specificAllowedDifferences:]: mediaanalysisd queried for: com.apple.MobileAsset.LinguisticData with returnTypes 2 (MAUnionOfCatalogInstalled) and found 0 assets with result 0 (MAQuerySuccessful) --> From 743 listed in the catalog and 1 local (1/1 downloaded, 0 preinstalled)--> Catalog info ({ isLiveServer = 0; }) --> Filtered for MAUnionOfCatalogInstalled to 0 in catalog (0 installed, 0 server-only, 0 preinstalled), 0 installedNotInCatalog, 0 installedWithOS, 0 requiredByOS; the query params are: [ AssetLocale:'en' and AssetType:'Delta' and _CompatibilityVersion:'11' and _SupportedPlatforms:'macOS'] --> Merged to 0 assets
8.454970 com.apple.DataDeliveryServices assetsForQuery: <query: com.apple.MobileAsset.LinguisticData, locO: 1, iO: 1, latO: 1, <filter: {
AssetLocale = "{(\n en\n)}";
AssetType = "{(\n Delta\n)}";
}>> final result: (
) was cached: 0, cachedOnly: 0

A similar sequence adds another language.
8.506237 com.apple.DataDeliveryServices assetsForQuery: <query: com.apple.MobileAsset.LinguisticData, locO: 1, iO: 1, latO: 1, <filter: {
AssetLocale = "{(\n ja\n)}";
AssetType = "{(\n Delta\n)}";
}>> final result: (
) was cached: 0, cachedOnly: 0
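
If you’re curious which linguistic assets are already installed on your Mac, a short sketch of my own (it does nothing more than read the folder named in those entries) counts the .asset bundles:
import Foundation

// Count the locally installed linguistic asset bundles in the
// folder used by mobileassetd, as shown in the log entries above.
let assetDir = URL(fileURLWithPath:
    "/System/Library/AssetsV2/com_apple_MobileAsset_LinguisticData")
let contents = (try? FileManager.default.contentsOfDirectory(
    at: assetDir, includingPropertiesForKeys: nil)) ?? []
let assets = contents.filter { $0.pathExtension == "asset" }
print("Installed linguistic asset bundles: \(assets.count)")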

On this occasion, because macOS had been updated so recently, no .asset files had to be downloaded or updated, and LanguageModeling was able to proceed without making any network connection.
8.611597 com.apple.LanguageModeling NgramModel: Loaded language model: <private>
8.613612 LanguageModeling Creating CompositeLanguageModel (<private>):
<private>
8.624175 com.apple.LanguageModeling NgramModel: Loaded language model: <private>
8.654266 com.apple.coreml <private> class has successfully loaded the model at <private>.
8.668123 com.apple.LanguageModeling NeuralNetwork: Loaded neural language model: <private>

A similar series of entries records any translation of recognised text, which may again involve downloading updated translation information from Apple’s servers.

That lets the document recognition task complete, followed by the visual search gating task.
8.682162 mediaanalysisd VCPMADVIDocumentRecognitionTask complete
8.682166 mediaanalysisd VCPMADVIVisualSearchGatingTask running...
8.940130 mediaanalysisd VCPMADVIVisualSearchGatingTask complete (0)

Reporting

Finally, 1.2 seconds after the start of image analysis, VisionKit reports it complete.
8.942336 com.apple.VisionKit Completed MRC Parsing of 0 elements in 0.000000 seconds.
8.945134 com.apple.VisionKit VisualSearchGating: Request completed: <private>
8.945151 com.apple.VisionKit VisualSearchGating: Request completed: <private>
8.945313 com.apple.VisionKit END "VisionKit MAD Parse Request"
8.945355 com.apple.VisionKit Request completed: <private>
8.945396 com.apple.VisionKit Calling completion handler For Request ID:1
Total Processing Time 1169.34ms
Has Analysis: YES
TextLength: 65 DD: 0, MRC: 0, VS:0
request: <private>
Error: (null)

The reported total processing time was 1.17 seconds, and the analysis yielded text of length 65 (presumably Unicode code points). It’s at this point that the Live Text icon appears at the bottom right of the preview window, and you can select all the recognised text in that image.
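
In an app, the equivalent of that selectable text layer is VisionKit’s ImageAnalysisOverlayView. A rough sketch, assuming an ImageAnalysis obtained as in the earlier example, adds it over an existing NSImageView:
import AppKit
import VisionKit

// A sketch of presenting Live Text selection over an NSImageView,
// assuming `analysis` was produced by ImageAnalyzer as shown earlier.
@MainActor
func addLiveText(_ analysis: ImageAnalysis, over imageView: NSImageView) {
    let overlay = ImageAnalysisOverlayView()
    overlay.frame = imageView.bounds
    overlay.autoresizingMask = [.width, .height]
    overlay.preferredInteractionTypes = .textSelection   // Live Text only
    overlay.analysis = analysis
    imageView.addSubview(overlay)
}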

Visual Look Up

The sequence of log entries above is quite different from those I have described for Visual Look Up (VLU). Most obvious is the fact that the final phase in VLU, Visual Search, doesn’t take place. That phase is marked not by the VisionKit Analyzer process seen in both VLU and Live Text, but by VisionKit’s MAD Visual Search, which is confined to VLU. It’s only then that the neural hash(es) computed during analysis are sent to Apple’s servers by mediaanalysisd, for the servers to return information about image matches and content.
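
In terms of the public API, that difference comes down to which analysis types are requested. The following is only an illustration of the distinction, not a claim about how QuickLook itself is written:
import VisionKit

// Live Text only: analysis stays on this Mac.
let textOnly = ImageAnalyzer.Configuration([.text])

// Adding Visual Look Up brings in the Visual Search phase,
// which consults Apple’s servers about the image’s content.
let withLookUp = ImageAnalyzer.Configuration([.text, .visualLookUp])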

Conclusions

  • Live Text analysis doesn’t generate neural hashes or other identifiers for an image, in the way that Visual Look Up does.
  • Any connection to Apple’s servers during Live Text analysis is performed before the image has been analysed, and before the extraction of any text. It cannot, therefore, send Apple any image identifiers or extracted text.
  • Live Text relies on language asset files, which may need to be augmented or updated over a network connection during text recognition.
  • macOS 13.1 and 13.2 perform Live Text in essentially the same way, and both will attempt to connect to Apple’s servers if they need to update language asset files.
  • Users may encounter outgoing connections when opening a local image in QuickLook Preview, but can have confidence that those connections aren’t being used to send Apple, or any third party, image identifiers or text extracted from the image.
  • Blocking the outgoing connections used in Live Text will only result in poorer text recognition, and does nothing to improve the user’s privacy.

As ever, I welcome other factual evidence.