Generic-Strided Performance
Column-major matrix-matrix products (usual case) are handled with BLAS libraries in LinearAlgebra
. Common options here include:
- OpenBLAS (default)
- MKL (MKL.jl)
Comparison of BLIS against other popular BLAS vendors is available in this section of BLIS documents. This short article, instead, lists performance benchmarks of generic-strided matrix-matrix multiplications which is not supported by BLAS thus handled with plain loops in LinearAlgebra
. Strided.jl which provides sophisticated tailoring of all kinds of strided arrays is also included in this test.
Linux, AVX512
- OS: CentOS 7
- Processor: Intel(R) Xeon(R) Platinum 8260
- FP Pipeline: 2 AVX512 pipelines
- OpenMP Thread Used: 4
macOS, AVX2
- OS: macOS 10.15.7
- Processor: Intel(R) Core(R) i5 8259U
- FP Pipeline: AVX2 pipelines
- OpenMP Thread Used: 4