Last Week on My Mac: Swallowing a fly

Computing has for me been an endless succession of rabbit holes, and just at the moment I’m going deeper all the time. It all started with wanting accurate benchmarks for SSD performance and my app Stibium, and now I find myself in linear programming and matrices, which is why you haven’t had any of the promised updates to Stibium yet. Let me make my excuses.

When I was developing Stibium, I came across speed measurements which were anomalous, to say the least, and were dominating the results returned by some other disk benchmarking apps. These were so fast – from 9 GB/s upwards – that the choice was stark: either the M1 Mac’s hardware was breaking the laws of physics, or the figures were spurious, and reflected transfers within memory rather than to or from the SSD. With the help of others, Ric Ford of MacInTouch in particular, I came to recognise these as outliers which could and did cause havoc with benchmarks.

The engineering answer to this might have been to identify results which are outliers, remove them, and carry on without them. Pragmatic as that is, you then have to decide how to identify those outliers, and the only real answer is that they seem higher (or in other cases, lower) than other results. This goes against the grain of most benchmarking, which is directed at demonstrating how much faster and better your new system is than your competitor’s. Coming up with unbiased and objective rules for classifying outliers is therefore extremely contentious.

So I turned to statistics for a solution, in what’s known as robust linear regression. Conventional linear regression – fitting a straight line to a series of points by the method of least squares – is quick and simple, but too easily swayed by outliers. By using a robust method, the influence of those outliers would be reduced, and the line fitted more closely to where most of the points are.
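To see how easily least squares is swayed, here is a minimal sketch in Python. The data are made up for illustration and are not Stibium results, and the function name is my own: eight transfer sizes with times lying exactly on a line, then the same set with a single spurious result.

```python
from statistics import mean


def least_squares_gradient(xs, ys):
    """Gradient of the ordinary least-squares line through the points."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical data, not Stibium results: sizes in GB, times in seconds,
# lying exactly on a line of gradient 0.5 s/GB (that is, 2 GB/s).
sizes = [1, 2, 3, 4, 5, 6, 7, 8]
times = [0.5 * s for s in sizes]

# The same data with one spurious result: the 8 GB transfer appears to
# take 0.4 s, as if it had been served from memory.
times_with_outlier = times[:-1] + [0.4]

clean = least_squares_gradient(sizes, times)
swayed = least_squares_gradient(sizes, times_with_outlier)
print(f"without outlier: {clean:.3f} s/GB, with outlier: {swayed:.3f} s/GB")
```

With these invented figures, one bad point out of eight drags the gradient from 0.5 to about 0.2 s/GB, turning an apparent 2 GB/s into 5 GB/s.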

The current version of Stibium does just that, using an ingenious method devised by Theil and Sen. It takes every possible pair of points in turn, and for each pair works out the equation of the line passing through them. It then calculates the median gradient of all those lines, and offers that as the gradient of the line which best fits all the points. And it works well in the face of outliers.
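The method is short enough to sketch in a few lines of Python. This isn't Stibium's own code, just an illustration of the idea described above; the function name and data are invented, and taking the intercept as the median of the residuals is one common choice that I'm assuming here.

```python
from itertools import combinations
from statistics import median


def theil_sen(xs, ys):
    """Return (gradient, intercept) of a Theil-Sen line through the points."""
    # Gradient of the line through every pair of points with distinct x.
    gradients = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
        if x2 != x1
    ]
    gradient = median(gradients)
    # Assumption: intercept taken as the median of y - gradient * x.
    intercept = median(y - gradient * x for x, y in zip(xs, ys))
    return gradient, intercept


# Hypothetical data: sizes in GB and times in seconds on a line of
# gradient 0.5 s/GB, except that the last result is a spurious 0.4 s.
sizes = [1, 2, 3, 4, 5, 6, 7, 8]
times = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 0.4]

gradient, intercept = theil_sen(sizes, times)
print(f"gradient {gradient} s/GB, intercept {intercept} s")
```

Only 7 of the 28 pairs involve the outlier, so the median gradient stays at 0.5 s/GB. The cost is that the number of pairs grows with the square of the number of points.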

If you thought that was the end of this rabbit hole, it’s only the start.

Theil-Sen regression isn’t the only game in town, not by a long way. More popular and arguably far superior is a group of techniques known as quantile regression. For some reason, those who like Theil-Sen and other robust methods don’t seem to talk with those who are into quantile regression, or the other way around. I haven’t been able to find any direct comparisons, which seems odd: they start from the same place and just follow different routes to a similar goal but, like rival bus companies, each pretends the other doesn’t exist.

Instead of analysing all the possible lines, quantile regression uses linear programming to find the best line of them all. You may have vague recollections of linear programming from graphical examples in the distant past. It’s quite an old method which manipulates matrices – those two- and more-dimensional arrays of numbers which seem to have been designed to make serious maths a minority sport. You may be surprised to learn that macOS features superb support for matrix maths, in what Apple terms its Accelerate libraries.
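For those who like to see it written down, the linear programme behind quantile regression for a straight line is usually set out something like this. The notation is mine rather than anything from Stibium: the points are (x_i, y_i), the line has intercept β0 and gradient β1, τ is the quantile wanted (0.5 for the median), and u_i and v_i are the positive and negative parts of each point's distance from the line.

```latex
\begin{aligned}
\min_{\beta_0,\,\beta_1,\,u,\,v}\quad
  & \sum_{i=1}^{n} \bigl(\tau\,u_i + (1-\tau)\,v_i\bigr) \\
\text{subject to}\quad
  & y_i = \beta_0 + \beta_1 x_i + u_i - v_i, \qquad i = 1,\dots,n, \\
  & u_i \ge 0, \quad v_i \ge 0 .
\end{aligned}
```

With τ = 0.5 this amounts to minimising the sum of the absolute distances of the points from the line, rather than the sum of their squares, which is why a single wild result carries so much less weight. The equality constraints are where the matrices come in: stacked up for all n points, they form one large matrix equation for a solver to work on.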

More than a generation ago, when I was crunching numbers heavily in Classic Mac OS, the predecessors of Accelerate were known as SANE, the Standard Apple Numerics Environment. I still have a hardbound copy of the second edition of its reference manual, published by Addison-Wesley in 1988.

Working with matrices is understandably more complicated, so my next task is to get my head around the strengths and limitations of the Accelerate libraries, and work out how I can best use them to support linear programming, so that I can use quantile regression, and compare it with Theil-Sen linear regression, and decide which I should use in Stibium, to ensure that its results are as accurate as possible and not dominated by those outliers which were the start of all these problems.

Or to put it into song,
There was an old lady who swallowed a cow;
I don’t know how she swallowed a cow!
She swallowed the cow to catch the goat,
She swallowed the goat to catch the dog,
She swallowed the dog to catch the cat,
She swallowed the cat to catch the bird,
She swallowed the bird to catch the spider
That wriggled and jiggled and tickled inside her!
She swallowed the spider to catch the fly;
I don’t know why she swallowed a fly – Perhaps she’ll die!
Wikipedia.

The trick, I believe, is to stop short of the horse.

11 Comments


1

Andrew Reilly on February 21, 2021 at 8:21 am

I’m not sure about Stibium, but as you know, modern OSes are complex and sophisticated beasties. If there’s something to be done over and over, and it’s expensive, it’s worth a lot of complexity to figure out how to avoid doing as much of it as possible. Hence caches (processor, file and block), cache replacement strategies, look-ahead prefetching, log-structured file systems, flash wear leveling: it’s abstractions all the way down. If you’re trying to measure the performance of the very last step, then you’re also trying to figure out how to defeat all of the layers of work-arounds that have been cleverly developed to avoid needing to do that last step (read or write to flash).

Alternatively, if you’re interested in application performance, as seen by applications, then all of the short-cuts apply, and your performance outliers might actually be the main game. Fastest SSD read is the one you don’t have to do.

- 2
  
  hoakley on February 22, 2021 at 12:09 am
  
  Thank you.
  I’ll be explaining more in a further article.
  Howard.
  
3

Aron D. on February 21, 2021 at 11:46 am

funny song – in my language though swallowing a fly (by a woman) is a synonym for getting pregnant

- 4
  
  hoakley on February 22, 2021 at 12:09 am
  
  Thank you.
  I’m sorry the article must have disappointed you!
  Howard.
  
5

Duncan on February 21, 2021 at 12:28 pm

I don’t think SSD performance can be truly measured until your testing algorithm gains sentience.

6

Javier Gallardo on February 21, 2021 at 6:06 pm

…When I read you about struggling into logic, finding inconveniences, digging deeper… sometimes I dare to wonder if some important changes are happening in physical, electric, or material architectural base. I know programs just should work, as they’re just logic; but maybe when applied to measure “material” data, physical changes matter.
Things like: external bootdisk/usbA, not /usbC… the hodoo-voodoo to reach “fallback recovery mode” (gaming with electrical capacitors‽)… when some disk-control chips work, but not others (perhaps with differences per mac model)… …make me think about physical facts that need to be deeper known. I think Apple is not giving all information needed, or Apple is still investigating (with the help from users).

- 7
  
  hoakley on February 22, 2021 at 12:13 am
  
  Thank you.
  I think this is because systems have become so complex that they no longer behave predictably. I first noticed it with the Commodore Amiga in 1987, and it has steadily grown worse since then.
  Howard.
  
8

Aron D. on February 21, 2021 at 9:25 pm

by the way, have you heard or experienced yourself the SSD high writes issue on M1 macs as posted here? https://linustechtips.com/topic/1306757-m1-mac-owners-are-experiencing-extremely-high-ssd-writes-over-short-periods-of-time-likely-thanks-to-aggressive-swap/

- 9
  
  hoakley on February 22, 2021 at 12:15 am
  
  Thank you.
  No – neither. However, I’m not convinced that what they think they’re seeing is really what they interpret it as.
  Howard.
  
10

John Blommers on February 21, 2021 at 11:32 pm

If your experiment needs statistics, you ought to have done a better experiment.

Ernest Rutherford

- 11
  
  hoakley on February 22, 2021 at 12:17 am
  
  Thank you.
  As a nuclear physicist, he could get away with that, so long as he could rely on maths, which is all that most of those techniques are – indeed, I don’t think that anyone would claim that linear programming is statistics, and linear regression is no more than maths either.
  Howard.
  