What are the penalties in real-world use for running your code on Icestorm cores, using around 10% of the power used by Firestorms?
How ARM64 uses its special SIMD registers in lanes, and how they can be loaded with and without de-interleaving.
Three recent WWDC sessions extol Apple’s “extensive reference material”, yet Xcode can’t find anything on these rich and extensive libraries.
More cores are great for running more processes, but how can you make individual operations within a process faster? SIMD is one solution.
Benchmarking 32-bit Float vector dot-product calculations using Swift, NEON assembly, and Apple’s SIMD libraries, on Intel and M1 Macs.
It started with wanting to benchmark SSDs, passed through robust linear regression, and is now diving deep into linear programming and matrices.