Octavian

Octavian.jl is a multi-threaded BLAS-like library that provides pure Julia matrix multiplication on the CPU, built on top of LoopVectorization.jl.
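As a minimal sketch of basic usage, the snippet below multiplies two dense matrices with Octavian's `matmul` (allocating) and `matmul!` (in-place) and compares the result against Julia's built-in `*`. The matrix sizes and variable names are arbitrary choices for illustration.

```julia
using Octavian

# Arbitrary example sizes: (200 x 300) * (300 x 100) -> (200 x 100)
A = rand(Float64, 200, 300)
B = rand(Float64, 300, 100)

# Allocating multiply: returns a new matrix
C = Octavian.matmul(A, B)

# In-place multiply: writes the product into a preallocated output
C2 = similar(C)
Octavian.matmul!(C2, A, B)

# Compare against the built-in (BLAS-backed) product
agrees = (C ≈ A * B) && (C2 ≈ C)
```

For one-off products the built-in `*` is usually simpler. The in-place form is mainly of interest when the output buffer can be reused across calls.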

The source code for Octavian is available in the GitHub repository.

Julia Package     CPU    GPU
Gaius.jl          Yes    No
GemmKernels.jl    No     Yes
Octavian.jl       Yes    No
Tullio.jl         Yes    Yes

In general:

  • Octavian has the fastest CPU performance.
  • GemmKernels has the fastest GPU performance.
  • Tullio is the most flexible.
Note

Octavian's tasks can interfere with tasks spawned by Base.Threads, resulting in much slower execution times when used together. This can be avoided by using threading utilities from Polyester or LoopVectorization instead. See this Discourse post for more information.
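One way this might look in practice is sketched below: an outer loop over independent products is parallelized with Polyester's `@batch` rather than `Threads.@threads`, and each iteration calls Octavian's single-threaded `matmul_serial!` so that the outer loop is the only source of parallelism. The helper name `batched_matmul!` and the problem sizes are hypothetical, and this is an illustration of the advice above, not a pattern taken from the linked post.

```julia
using Octavian
using Polyester  # provides the @batch threading macro

# Hypothetical helper: computes Cs[i] = As[i] * Bs[i] for every i.
# The outer loop is threaded with Polyester's @batch instead of
# Threads.@threads; each iteration uses the serial Octavian kernel.
function batched_matmul!(Cs, As, Bs)
    @batch for i in 1:length(As)
        Octavian.matmul_serial!(Cs[i], As[i], Bs[i])
    end
    return Cs
end

# Arbitrary example data: eight independent 64 x 64 products
As = [rand(64, 64) for _ in 1:8]
Bs = [rand(64, 64) for _ in 1:8]
Cs = [zeros(64, 64) for _ in 1:8]

batched_matmul!(Cs, As, Bs)
```

Whether threading the outer loop beats a single multi-threaded `matmul!` per product depends on the matrix sizes and the number of available threads, so it is worth benchmarking both.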