Array Operations

AppleAccelerate wraps Apple's vecLib (vv*) and vDSP (vDSP_*) functions to provide accelerated element-wise operations on Array{Float32} and Array{Float64}.

These functions are not exported, to avoid conflicts with Base. Call them with the AppleAccelerate. prefix, for example AppleAccelerate.exp(X).

Element-wise Math Functions

These functions wrap Apple's vecLib vv* routines.

One-argument functions

Each function f has an allocating variant f(X) and a mutating variant f!(out, X):

| Function | Description |
| --- | --- |
| ceil, floor, trunc, round | Rounding |
| sqrt, rsqrt, rec | Square root, reciprocal square root, reciprocal |
| exp, exp2, expm1 | Exponentials |
| log, log1p, log2, log10 | Logarithms |
| sin, sinpi, cos, cospi, tan, tanpi | Trigonometric |
| asin, acos, atan | Inverse trigonometric |
| sinh, cosh, tanh, asinh, acosh, atanh | Hyperbolic |
| abs, exponent | Miscellaneous |

Two-argument functions

| Function | Description |
| --- | --- |
| copysign(X, Y) | Copy sign of Y to X |
| rem(X, Y) | Element-wise remainder |
| div_float(X, Y) | Element-wise division (via vecLib) |
| atan(X, Y) | Two-argument arctangent |
| pow(X, Y) | Element-wise power |

Special return types

| Function | Description |
| --- | --- |
| sincos(X) | Returns (sin(X), cos(X)) tuple |
| cis(X) | Returns Complex array cos(X) + im*sin(X) |

X = randn(Float64, 1000)

# Element-wise math — 3–19× faster than Base
Y_exp = AppleAccelerate.exp(X)
Y_sin = AppleAccelerate.sin(X)
Y_log = AppleAccelerate.log(X .+ 10)  # shift to positive domain

# Mutating variant (pre-allocate output)
out = similar(X)
AppleAccelerate.exp!(out, X)

# Broadcasting works automatically
Y_broadcast = AppleAccelerate.sin.(X)

AppleAccelerate.sincos (Function)
sincos(X::Array{T}) where T <: Union{Float32, Float64}

Compute the sine and cosine of each element simultaneously via vecLib vvsincos. Returns a tuple (sin(X), cos(X)) of arrays. Faster than computing sin and cos separately since both are produced in a single pass.

The mutating variant sincos!(out_sin, out_cos, X) stores results in preallocated arrays.

AppleAccelerate.cis (Function)
cis(X::Array{T}) where T <: Union{Float32, Float64}

Compute cos(x) + im*sin(x) for each element via vecLib vvcosisin. Returns a Complex{T} array. Equivalent to exp.(im .* X) but faster.

The mutating variant cis!(out, X) stores results in a preallocated complex array.

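As a plain-Julia cross-check of what sincos and cis compute, the sketch below uses only Base. The helper names ref_sincos and ref_cis are hypothetical stand-ins, not part of AppleAccelerate; on macOS the package results would be expected to agree with them up to floating-point rounding.

```julia
# Reference semantics in Base Julia (hypothetical helpers, not the package API).
ref_sincos(X) = (sin.(X), cos.(X))        # same tuple shape as sincos(X)
ref_cis(X) = complex.(cos.(X), sin.(X))   # cos(x) + im*sin(x) per element

X = [0.0, pi / 6, pi / 2]
s, c = ref_sincos(X)
Z = ref_cis(X)

# cis matches the complex exponential named in the docstring.
@assert Z ≈ exp.(im .* X)
# The real and imaginary parts are exactly the cosine and sine arrays.
@assert real.(Z) == c && imag.(Z) == s
```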

Unary vDSP Operations

Wraps vDSP unary vector operations.

| Function | Description |
| --- | --- |
| vneg | Negate each element: result[i] = -X[i] |
| vnabs | Negative absolute value: result[i] = -abs(X[i]) |
| vabs | Absolute value: result[i] = abs(X[i]) |
| vsq | Square each element: result[i] = X[i]^2 |
| vssq | Signed square: result[i] = X[i] * abs(X[i]) |
| vfrac | Fractional part: result[i] = X[i] - trunc(X[i]) |
| vreverse! | Reverse vector in-place |
| vreverse | Return a reversed copy |
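The less familiar formulas in this table can be spelled out with ordinary broadcasting. This is a Base-only sketch of the semantics; the ref_* names are hypothetical and are not AppleAccelerate functions.

```julia
# Base-Julia reference for three of the element-wise formulas listed above.
ref_vnabs(X) = -abs.(X)         # negative absolute value
ref_vssq(X)  = X .* abs.(X)     # signed square keeps the sign of X
ref_vfrac(X) = X .- trunc.(X)   # fractional part, sign follows X

X = [-2.5, -0.5, 0.0, 1.25]
@assert ref_vnabs(X) == [-2.5, -0.5, 0.0, -1.25]
@assert ref_vssq(X)  == [-6.25, -0.25, 0.0, 1.5625]
@assert ref_vfrac(X) == [-0.5, -0.5, 0.0, 0.25]
```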

Vector Reductions

Wraps vDSP reduction functions.

| Function | Description | Apple function |
| --- | --- | --- |
| maximum(X), minimum(X) | Max/min value | vDSP_maxv, vDSP_minv |
| findmax(X), findmin(X) | Max/min value and index | vDSP_maxvi, vDSP_minvi |
| sum(X), mean(X) | Sum and mean | vDSP_sve, vDSP_meanv |
| meanmag(X) | Mean of absolute values | vDSP_meamgv |
| meansqr(X) | Mean of squares | vDSP_measqv |
| meanssqr(X) | Mean of signed squares | vDSP_mvessq |
| summag(X) | Sum of absolute values | vDSP_svemg |
| sumsqr(X) | Sum of squares | vDSP_svesq |
| sumssqr(X) | Sum of signed squares | vDSP_svs |
| dot | Dot product: sum(X .* Y) | vDSP_dotpr |
| distancesq | Squared Euclidean distance: sum((X .- Y).^2) | vDSP_distancesq |
| rmsqv | Root mean square: sqrt(sum(X.^2)/N) | |
| sve_svesq | Simultaneous sum and sum-of-squares | |
| maxmgv | Maximum magnitude: maximum(abs.(X)) | |
| minmgv | Minimum magnitude: minimum(abs.(X)) | |
| maxmgvi | Maximum magnitude with index | |
| minmgvi | Minimum magnitude with index | |

X = randn(Float64, 10_000)

# Reductions
s = AppleAccelerate.sum(X)
mx = AppleAccelerate.maximum(X)
val, idx = AppleAccelerate.findmax(X)
avg = AppleAccelerate.mean(X)

AppleAccelerate.sum (Function)
sum(X::Vector{T}) where T <: Union{Float32, Float64}

Return the sum of elements in X via vDSP. Equivalent to Base.sum(X). Wraps vDSP_sve.

AppleAccelerate.findmax (Function)
findmax(X::Vector{T}) where T <: Union{Float32, Float64}

Return (value, index) of the maximum element in X via vDSP. Equivalent to Base.findmax(X). Wraps vDSP_maxvi.

AppleAccelerate.findmin (Function)
findmin(X::Vector{T}) where T <: Union{Float32, Float64}

Return (value, index) of the minimum element in X via vDSP. Equivalent to Base.findmin(X). Wraps vDSP_minvi.

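For the reductions whose names are less self-explanatory, a Base-only reference can help pin down the formulas in the table. The ref_* helpers below are hypothetical names for illustration and are not part of the package.

```julia
# Base-Julia reference for a few of the reductions listed in the table.
ref_meanmag(X)   = sum(abs, X) / length(X)          # mean of absolute values
ref_sumssqr(X)   = sum(x -> x * abs(x), X)          # sum of signed squares
ref_rmsqv(X)     = sqrt(sum(abs2, X) / length(X))   # root mean square
ref_sve_svesq(X) = (sum(X), sum(abs2, X))           # sum and sum of squares

X = [3.0, -4.0]
@assert ref_meanmag(X) == 3.5
@assert ref_sumssqr(X) == -7.0            # 9 - 16
@assert ref_rmsqv(X) ≈ sqrt(12.5)
@assert ref_sve_svesq(X) == (-1.0, 25.0)
```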

Vector-Vector Arithmetic

| Function | Description | Apple function |
| --- | --- | --- |
| vadd / vadd! | Element-wise addition | vDSP_vadd |
| vsub / vsub! | Element-wise subtraction | vDSP_vsub |
| vmul / vmul! | Element-wise multiplication | vDSP_vmul |
| vdiv / vdiv! | Element-wise division | vDSP_vdiv |

A = randn(Float64, 1000)
B = randn(Float64, 1000)

# Vector arithmetic
C = AppleAccelerate.vadd(A, B)   # A .+ B
D = AppleAccelerate.vmul(A, B)   # A .* B

# Compound operation: A * scalar + B
E = AppleAccelerate.vsma(A, 2.5, B)  # A .* 2.5 .+ B

AppleAccelerate.vadd (Function)

vadd(X::Vector{Float32}, Y::Vector{Float32})
vadd(X::Vector{Float64}, Y::Vector{Float64})

Element-wise addition of two vectors with the same element type. Allocates a new vector for the result. Returns: Vector{Float32} or Vector{Float64}, matching the inputs.

AppleAccelerate.vadd! (Function)

vadd!(result::Vector{Float32}, X::Vector{Float32}, Y::Vector{Float32})
vadd!(result::Vector{Float64}, X::Vector{Float64}, Y::Vector{Float64})

Element-wise addition of two vectors, overwriting result with the computed values. Returns: result.

AppleAccelerate.vsub (Function)

vsub(X::Vector{Float32}, Y::Vector{Float32})
vsub(X::Vector{Float64}, Y::Vector{Float64})

Element-wise subtraction of two vectors with the same element type. Allocates a new vector for the result. Returns: Vector{Float32} or Vector{Float64}, matching the inputs.

AppleAccelerate.vsub! (Function)

vsub!(result::Vector{Float32}, X::Vector{Float32}, Y::Vector{Float32})
vsub!(result::Vector{Float64}, X::Vector{Float64}, Y::Vector{Float64})

Element-wise subtraction of two vectors, overwriting result with the computed values. Returns: result.

AppleAccelerate.vmul (Function)

vmul(X::Vector{Float32}, Y::Vector{Float32})
vmul(X::Vector{Float64}, Y::Vector{Float64})

Element-wise multiplication of two vectors with the same element type. Allocates a new vector for the result. Returns: Vector{Float32} or Vector{Float64}, matching the inputs.

AppleAccelerate.vmul! (Function)

vmul!(result::Vector{Float32}, X::Vector{Float32}, Y::Vector{Float32})
vmul!(result::Vector{Float64}, X::Vector{Float64}, Y::Vector{Float64})

Element-wise multiplication of two vectors, overwriting result with the computed values. Returns: result.

AppleAccelerate.vdiv (Function)

vdiv(X::Vector{Float32}, Y::Vector{Float32})
vdiv(X::Vector{Float64}, Y::Vector{Float64})

Element-wise division of two vectors with the same element type. Allocates a new vector for the result. Returns: Vector{Float32} or Vector{Float64}, matching the inputs.

AppleAccelerate.vdiv! (Function)

vdiv!(result::Vector{Float32}, X::Vector{Float32}, Y::Vector{Float32})
vdiv!(result::Vector{Float64}, X::Vector{Float64}, Y::Vector{Float64})

Element-wise division of two vectors, overwriting result with the computed values. Returns: result.

Two-Vector Comparison & Distance

| Function | Description |
| --- | --- |
| vmax | Element-wise maximum |
| vmin | Element-wise minimum |
| vmaxmg | Element-wise maximum magnitude |
| vminmg | Element-wise minimum magnitude |
| vdist | Element-wise Euclidean distance |
| vtmerg | Tapered merge of two vectors |

Vector-Scalar Operations

| Function | Description | Apple function |
| --- | --- | --- |
| vsadd / vsadd! | Vector + scalar | vDSP_vsadd |
| vssub / vssub! | Vector - scalar | vDSP_vsadd |
| svsub / svsub! | Scalar - vector | vDSP_vsadd |
| vsmul / vsmul! | Vector * scalar | vDSP_vsmul |
| vsdiv / vsdiv! | Vector / scalar | vDSP_vsdiv |
| svdiv | Scalar / vector | vDSP_svdiv |

AppleAccelerate.vsadd (Function)

vsadd(X::Vector{Float32}, c::Float32)
vsadd(X::Vector{Float64}, c::Float64)

Vector-scalar addition of a vector and a scalar of the same element type. Allocates a new vector for the result. Returns: Vector{Float32} or Vector{Float64}, matching the inputs.

AppleAccelerate.vsadd! (Function)

vsadd!(result::Vector{Float32}, X::Vector{Float32}, c::Float32)
vsadd!(result::Vector{Float64}, X::Vector{Float64}, c::Float64)

Vector-scalar addition, overwriting result with the computed values. Returns: result.

AppleAccelerate.vssub (Function)

vssub(X::Vector{Float32}, c::Float32)
vssub(X::Vector{Float64}, c::Float64)

Vector-scalar subtraction (vector minus scalar). Allocates a new vector for the result. Returns: Vector{Float32} or Vector{Float64}, matching the inputs.

AppleAccelerate.vssub! (Function)

vssub!(result::Vector{Float32}, X::Vector{Float32}, c::Float32)
vssub!(result::Vector{Float64}, X::Vector{Float64}, c::Float64)

Vector-scalar subtraction (vector minus scalar), overwriting result with the computed values. Returns: result.

AppleAccelerate.svsub (Function)

svsub(X::Vector{Float32}, c::Float32)
svsub(X::Vector{Float64}, c::Float64)

Scalar-vector subtraction (scalar minus vector). Allocates a new vector for the result. Returns: Vector{Float32} or Vector{Float64}, matching the inputs.

AppleAccelerate.svsub! (Function)

svsub!(result::Vector{Float32}, X::Vector{Float32}, c::Float32)
svsub!(result::Vector{Float64}, X::Vector{Float64}, c::Float64)

Scalar-vector subtraction (scalar minus vector), overwriting result with the computed values. Returns: result.

AppleAccelerate.vsmul (Function)

vsmul(X::Vector{Float32}, c::Float32)
vsmul(X::Vector{Float64}, c::Float64)

Vector-scalar multiplication. Allocates a new vector for the result. Returns: Vector{Float32} or Vector{Float64}, matching the inputs.

AppleAccelerate.vsmul! (Function)

vsmul!(result::Vector{Float32}, X::Vector{Float32}, c::Float32)
vsmul!(result::Vector{Float64}, X::Vector{Float64}, c::Float64)

Vector-scalar multiplication, overwriting result with the computed values. Returns: result.

AppleAccelerate.vsdiv (Function)

vsdiv(X::Vector{Float32}, c::Float32)
vsdiv(X::Vector{Float64}, c::Float64)

Vector-scalar division (vector divided by scalar). Allocates a new vector for the result. Returns: Vector{Float32} or Vector{Float64}, matching the inputs.

AppleAccelerate.vsdiv! (Function)

vsdiv!(result::Vector{Float32}, X::Vector{Float32}, c::Float32)
vsdiv!(result::Vector{Float64}, X::Vector{Float64}, c::Float64)

Vector-scalar division (vector divided by scalar), overwriting result with the computed values. Returns: result.

Compound Arithmetic

These operations fuse multiple arithmetic steps into a single vDSP call for better performance.

Three-vector operations

| Function | Description |
| --- | --- |
| vam | (A + B) * C |
| vsbm | (A - B) * C |
| vma | A * B + C |
| vmsb | A * B - C |
| venvlp | Signal envelope |

Four-vector operations

| Function | Description |
| --- | --- |
| vaam | (A + B) * (C + D) |
| vsbsbm | (A - B) * (C - D) |
| vasbm | (A + B) * (C - D) |
| vmma | A * B + C * D |
| vmmsb | A * B - C * D |
| vpythg | Pythagorean distance |

Vector-vector-scalar operations

| Function | Description |
| --- | --- |
| vasm | (A + B) * c |
| vsbsm | (A - B) * c |
| vsma | A * b + C |
| vsmsa | A * b + c |
| vmsa | A * B + c |
| vsmsb | A * b - C |
| vsmsma | A * b + C * d |

Dual output

| Function | Description |
| --- | --- |
| vaddsub | Simultaneous add and subtract: returns (A .+ B, A .- B) |
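The compound formulas read most easily next to their broadcast equivalents. The sketch below is Base-only; the ref_* names are hypothetical, and the argument order simply follows the formulas in the tables (uppercase for vectors, lowercase for scalars), which is an assumption about the wrappers rather than a statement of their signatures.

```julia
# Broadcast equivalents of a few fused operations (reference semantics only).
ref_vam(A, B, C)  = (A .+ B) .* C       # three vectors
ref_vma(A, B, C)  = A .* B .+ C         # three vectors
ref_vsma(A, b, C) = A .* b .+ C         # vector, scalar, vector
ref_vaddsub(A, B) = (A .+ B, A .- B)    # two outputs

A = [1.0, 2.0, 3.0]
B = [0.5, 0.5, 0.5]
C = [10.0, 20.0, 30.0]

@assert ref_vam(A, B, C) == [15.0, 50.0, 105.0]
@assert ref_vma(A, B, C) == [10.5, 21.0, 31.5]
@assert ref_vsma(A, 2.0, C) == [12.0, 24.0, 36.0]
@assert ref_vaddsub(A, B) == ([1.5, 2.5, 3.5], [0.5, 1.5, 2.5])
```

A fused call avoids the temporary arrays that a chain of separate element-wise calls would allocate, which is the performance benefit this section refers to.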

Clipping & Thresholding

| Function | Description |
| --- | --- |
| vclip | Clip values to [low, high] |
| vclipc | Clip with count: returns (clipped, nlow, nhigh) |
| viclip | Inverted clip: pass values outside [low, high] |
| vthr | Threshold: keep or clamp to threshold |
| vthres | Threshold to zero |
| vlim | Test limit: (b <= A[i]) ? c : -c |
| vthrsc | Threshold with signed constant |
| vcmprs | Compress: gather elements where gate is nonzero |
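The difference between clipping and the threshold variants is easier to see on a small input. The following Base-only sketch reflects one reading of the table entries; the ref_* helpers are hypothetical, and the exact boundary behaviour of the wrapped vDSP routines is an assumption here.

```julia
# Reference semantics for selected clipping/thresholding operations.
ref_vclip(X, low, high) = clamp.(X, low, high)                  # limit to [low, high]
ref_vthr(X, t)          = max.(X, t)                            # raise values below t up to t
ref_vthres(X, t)        = [x >= t ? x : zero(x) for x in X]     # zero out values below t
ref_vlim(X, b, c)       = [b <= x ? c : -c for x in X]          # +c or -c by comparison with b

X = [-3.0, -1.0, 0.5, 2.0]
@assert ref_vclip(X, -1.0, 1.0) == [-1.0, -1.0, 0.5, 1.0]
@assert ref_vthr(X, 0.5)        == [0.5, 0.5, 0.5, 2.0]
@assert ref_vthres(X, 0.5)      == [0.0, 0.0, 0.5, 2.0]
@assert ref_vlim(X, 0.0, 1.0)   == [-1.0, -1.0, 1.0, 1.0]
```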

Type Conversion

| Function | Description |
| --- | --- |
| vdouble | Convert Float32 to Float64 |
| vsingle | Convert Float64 to Float32 |

Ramp Generation

| Function | Description |
| --- | --- |
| vramp | Generate a ramp: start + i * step |
| vrampmul | Multiply vector by a generated ramp |
| vrampmul2 | Stereo ramp multiply (two outputs) |

Linear Average

| Function | Description |
| --- | --- |
| vavlin | Weighted linear average of two vectors |

Integration & Running Operations

| Function | Description |
| --- | --- |
| vrsum | Running sum scaled by scale |
| vsimps | Simpson's rule integration |
| vtrapz | Trapezoidal integration |
| vswsum | Sliding window sum |
| vswmax | Sliding window maximum |

Interpolation

| Function | Description |
| --- | --- |
| vintb | Linear interpolation: A + t * (B - A) |
| vlint | Linear interpolation from lookup table |
| vqint | Quadratic interpolation from lookup table |

Polynomial Evaluation

| Function | Description |
| --- | --- |
| vpoly | Evaluate polynomial at each point |

Normalization

| Function | Description |
| --- | --- |
| vnormalize | Normalize to zero mean and unit standard deviation |

AppleAccelerate.vnormalize (Function)
vnormalize(X) -> (normalized, mean, stddev)

Normalize vector to zero mean and unit standard deviation: (X .- mean) ./ stddev. Returns a tuple of (normalized_vector, mean, stddev). Wraps vDSP_normalize.

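The normalization formula can be written out in Base Julia as a reference. In this sketch, ref_normalize is a hypothetical helper, and it assumes the standard deviation divides by N (the population form) rather than N - 1; check the wrapped routine if that distinction matters for your data.

```julia
# Reference for (X .- mean) ./ stddev, returning the same three-part tuple shape.
function ref_normalize(X)
    μ = sum(X) / length(X)
    σ = sqrt(sum(abs2, X .- μ) / length(X))   # assumption: divide by N, not N - 1
    return (X .- μ) ./ σ, μ, σ
end

X = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
Z, μ, σ = ref_normalize(X)

@assert μ == 5.0 && σ == 2.0
@assert Z == [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```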

Zero Crossings

| Function | Description |
| --- | --- |
| nzcros | Find zero crossings |

AppleAccelerate.nzcros (Function)
nzcros(X, max_crossings=0) -> (indices, count)

Find zero crossings in X. Returns a tuple of (crossing_indices, count). If max_crossings <= 0, searches for up to length(X) crossings. Wraps vDSP_nzcros.


Decibel Conversion

| Function | Description |
| --- | --- |
| vdbcon | Convert to decibels relative to a reference |

AppleAccelerate.vdbcon (Function)
vdbcon(X, ref, power=true)

Convert to decibels relative to ref. If power=true, computes 10*log10(X/ref); if power=false, computes 20*log10(X/ref). Wraps vDSP_vdbcon.

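The two decibel formulas differ only in the multiplier. A Base-only reference, with ref_vdbcon as a hypothetical helper name mirroring the documented argument order, might look like this:

```julia
# power=true uses 10*log10 (power ratios); power=false uses 20*log10 (amplitude ratios).
ref_vdbcon(X, ref, power=true) = (power ? 10 : 20) .* log10.(X ./ ref)

X = [1.0, 10.0, 100.0]
@assert ref_vdbcon(X, 1.0) ≈ [0.0, 10.0, 20.0]          # power ratios
@assert ref_vdbcon(X, 1.0, false) ≈ [0.0, 20.0, 40.0]   # amplitude ratios
```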

Vector Fill, Swap & Sort

| Function | Description |
| --- | --- |
| vclr! | Fill vector with zeros |
| vfill! | Fill vector with scalar value |
| vswap! | Swap two vectors in-place |
| vsort! | Sort vector in-place |
| vsorti | Return sort permutation (indices) |

Gathering & Indexing

| Function | Description |
| --- | --- |
| vgathr | Gather by index: C[i] = A[B[i]] |
| vindex | Index with float indices |
| vgen | Generate linear ramp between two values |
| vgenp | Piecewise linear interpolation from breakpoints |
| vtabi | Table lookup with interpolation |

Matrix Operations

| Function | Description |
| --- | --- |
| mmul | Matrix multiply: C = A * B |
| mtrans | Matrix transpose: C = Aᵀ |
| mmov | Matrix copy (submatrix move) |

Integer Operations (Int32)

| Function | Description |
| --- | --- |
| vaddi | Int32 vector addition |
| vabsi | Int32 absolute value |
| vfilli! | Fill Int32 vector with scalar |
| veqvi | Int32 bitwise XNOR |

Type Conversion (int ↔ float)

| Direction | Functions | Description |
| --- | --- | --- |
| float → signed int (truncate) | vfix8, vfix16, vfix32 | Truncating conversion |
| float → unsigned int (truncate) | vfixu8, vfixu16, vfixu32 | Truncating conversion |
| float → signed int (round) | vfixr8, vfixr16, vfixr32 | Rounding conversion |
| float → unsigned int (round) | vfixru8, vfixru16, vfixru32 | Rounding conversion |
| signed int → float | vflt8, vflt16, vflt32 | Signed integer to float |
| unsigned int → float | vfltu8, vfltu16, vfltu32 | Unsigned integer to float |
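The truncating and rounding directions give different integers for the same input, which a Base-only example makes concrete. This is a sketch of the two conversion modes using Julia's own trunc and round, not a call into the wrappers; how the vDSP routines break ties or handle out-of-range values is not shown here and may differ.

```julia
X = Float32[1.7, -1.7, 2.5, 3.5]

truncated = trunc.(Int32, X)   # drop the fractional part (toward zero)
rounded   = round.(Int32, X)   # nearest integer; Julia breaks ties to even

@assert truncated == Int32[1, -1, 2, 3]
@assert rounded   == Int32[2, -2, 2, 4]

# The reverse direction, integer to float, is exact for small values.
back = Float32.(Int16[-3, 7])
@assert back == Float32[-3.0, 7.0]
```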

Image Convolution

| Function | Description |
| --- | --- |
| f3x3 | 2D convolution with 3×3 filter |
| f5x5 | 2D convolution with 5×5 filter |
| imgfir | General 2D image convolution |

Format Conversion

| Function | Description |
| --- | --- |
| ctoz | Interleaved complex → split (real, imag) vectors |
| ztoc | Split (real, imag) vectors → interleaved complex |
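To illustrate the two layouts, the Base-only sketch below shows interleaved storage (how Julia stores a Vector{Complex{Float64}}) next to the split form of separate real and imaginary vectors. It demonstrates the data layouts only and does not call ctoz or ztoc.

```julia
Z = [1.0 + 2.0im, 3.0 - 4.0im]

# Interleaved: real and imaginary parts alternate in memory.
interleaved = collect(reinterpret(Float64, Z))
@assert interleaved == [1.0, 2.0, 3.0, -4.0]

# Split: one vector of real parts, one of imaginary parts.
re, imv = real.(Z), imag.(Z)
@assert re == [1.0, 3.0] && imv == [2.0, -4.0]

# Recombining the split form recovers the original array.
@assert complex.(re, imv) == Z
```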

Complex Array Operations

AppleAccelerate also wraps vDSP's split-complex functions for Vector{Complex{Float32}} and Vector{Complex{Float64}}. These extend existing function names (e.g., vneg, vabs, vmul) with methods that dispatch on complex element types, so they do not conflict with the real-valued versions above.

Complex unary operations

| Function | Description |
| --- | --- |
| vneg(X) / vneg!(result, X) | Negate: -X |
| vabs(X) / vabs!(result, X) | Modulus: abs.(X) |
| vconj | Complex conjugate |
| vcopy | Copy via split-complex move |

Complex → real operations

| Function | Description |
| --- | --- |
| vphase | Complex phase (angle) |
| vmags | Squared magnitude (abs2) |
| vmagsa | Squared magnitude + accumulate |

Complex binary operations

| Function | Description |
| --- | --- |
| vmul(X, Y) / vmul!(result, X, Y) | Element-wise multiply: X .* Y |
| vdiv(X, Y) / vdiv!(result, X, Y) | Element-wise divide: X ./ Y |
| vsmul(X, c) / vsmul!(result, X, c) | Scalar multiply (complex scalar) |
| dot(X, Y) | Unconjugated dot product: sum(X .* Y) |
| zvadd | Complex addition: A + B |
| zvsub | Complex subtraction: A - B |
| zvcmul | Conjugate multiply: conj(A) * B |

Complex-real operations

| Function | Description |
| --- | --- |
| zrvmul | Complex × real |
| zrvdiv | Complex / real |
| zrvadd | Complex + real (adds to real part) |
| zrvsub | Complex − real |

Complex compound operations

| Function | Description |
| --- | --- |
| zvcma | conj(A)*B + C |
| zvma | A*B + C |
| zvsma | A*b + C (b is complex scalar) |

Complex dot products

| Function | Description |
| --- | --- |
| zidotpr | Conjugate dot: sum(conj(A) .* B) |
| zrdotpr | Complex-real dot: sum(A .* B) |
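The unconjugated and conjugated dot products give different results on complex input, and only the conjugated one matches LinearAlgebra.dot. The Base-only sketch below evaluates both formulas from the tables directly; it does not call the AppleAccelerate methods.

```julia
using LinearAlgebra

X = [1.0 + 1.0im, 2.0 - 1.0im]
Y = [0.0 + 1.0im, 1.0 + 0.0im]

unconjugated = sum(X .* Y)          # formula listed for the complex dot(X, Y)
conjugated   = sum(conj.(X) .* Y)   # formula listed for zidotpr

@assert unconjugated == 1.0 + 0.0im
@assert conjugated == 3.0 + 2.0im

# Julia's LinearAlgebra.dot conjugates its first argument.
@assert LinearAlgebra.dot(X, Y) ≈ conjugated
```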

Complex fill & convolution

| Function | Description |
| --- | --- |
| zvfill! | Fill complex vector with scalar |
| zconv | Complex convolution |
| zmmul | Complex matrix multiply |

Coordinate conversion

| Function | Description |
| --- | --- |
| polar | Cartesian to polar coordinates |
| rect | Polar to Cartesian coordinates |

Z = AppleAccelerate.cis(randn(Float64, 100))  # complex array

# Complex operations
conj_Z = AppleAccelerate.vconj(Z)
phases = AppleAccelerate.vphase(Z)
mags = AppleAccelerate.vmags(Z)

# Coordinate conversion
r, θ = AppleAccelerate.polar(Z)
Z_back = AppleAccelerate.rect(r, θ)

AppleAccelerate.vmags (Function)
vmags(X::Vector{Complex{T}}) -> Vector{T}
vmags!(result, X)

Squared magnitude: result[i] = abs2(X[i]) = real(X[i])^2 + imag(X[i])^2. Wraps vDSP_zvmags.

AppleAccelerate.vmagsa (Function)
vmagsa(X::Vector{Complex{T}}, B::Vector{T}) -> Vector{T}
vmagsa!(result, X, B)

Squared magnitude and accumulate: result[i] = abs2(X[i]) + B[i]. Wraps vDSP_zvmgsa.

AppleAccelerate.rect (Function)
rect(magnitudes::Vector{T}, angles::Vector{T}) -> Vector{Complex{T}}

Convert polar coordinates to complex Cartesian form. Wraps vDSP_rect.


Broadcasting

AppleAccelerate overrides Base.copy and Base.copyto! for Broadcasted objects, so that broadcasting syntax like f.(X) automatically uses the accelerated implementation.
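The mechanism can be illustrated with a small self-contained sketch. Here my_sin is a hypothetical function standing in for an accelerated wrapper, and KERNEL_CALLS is a counter added only to show which method ran. Only a Base.copy method is defined, restricted to Vector{Float64}; the package's actual method signatures are not reproduced here and may be more general.

```julia
using Base.Broadcast: Broadcasted, DefaultArrayStyle

const KERNEL_CALLS = Ref(0)   # illustration only: counts whole-array calls

# Hypothetical function with a scalar method and a whole-array method.
my_sin(x::Float64) = sin(x)
function my_sin(X::Vector{Float64})
    KERNEL_CALLS[] += 1       # an accelerated library call would go here
    return map(sin, X)
end

# Intercept `my_sin.(X)`: materializing the Broadcasted object calls Base.copy,
# so this method forwards the whole array instead of looping per element.
function Base.copy(bc::Broadcasted{DefaultArrayStyle{1}, <:Any, typeof(my_sin), Tuple{Vector{Float64}}})
    return my_sin(bc.args[1])
end

X = [0.0, pi / 2]
Y = my_sin.(X)

@assert Y == [0.0, 1.0]
@assert KERNEL_CALLS[] == 1   # the array method ran exactly once
```

Because the method is keyed on the function type and the argument tuple type, broadcasts over other argument types (ranges, views, mixed scalars) fall through to Julia's default element-wise path in this sketch.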