Convolution

(f * g)(t) = ∫ f(τ) g(t - τ) dτ, with the integral taken over all τ from -∞ to ∞
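
For sampled signals the integral becomes a sum, (f * g)[n] = Σ_k f[k] g[n - k]. A minimal sketch of that discrete analogue, assuming finite sequences treated as zero outside their range; the function name `convolve` is illustrative and not part of the card:

```python
def convolve(f, g):
    """Full discrete convolution: out[n] = sum over k of f[k] * g[n - k]."""
    out = [0.0] * (len(f) + len(g) - 1)
    for k, fk in enumerate(f):
        for j, gj in enumerate(g):
            # f[k] * g[j] contributes to output index n = k + j
            out[k + j] += fk * gj
    return out

result = convolve([1, 2, 3], [0, 1, 0.5])
print(result)
```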

Lagrangian

For the constraint g(x) = c: L(x,λ) = f(x) - λ(g(x) - c). With c = 0 this reduces to L(x,λ) = f(x) - λg(x).
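
A small worked case may help. The sketch below assumes a hypothetical problem chosen only for illustration: minimize f(x, y) = x² + y² subject to g(x, y) = x + y = c. Setting the partial derivatives of L to zero gives x = y = c/2 and λ = c, and the code checks numerically that the gradient of L vanishes there.

```python
def lagrangian(x, y, lam, c):
    # L = f - lam * (g - c) with f = x^2 + y^2 and g = x + y (illustrative choice)
    return x**2 + y**2 - lam * (x + y - c)

def gradient(x, y, lam, c, h=1e-6):
    """Central-difference estimate of (dL/dx, dL/dy, dL/dlam)."""
    dx = (lagrangian(x + h, y, lam, c) - lagrangian(x - h, y, lam, c)) / (2 * h)
    dy = (lagrangian(x, y + h, lam, c) - lagrangian(x, y - h, lam, c)) / (2 * h)
    dlam = (lagrangian(x, y, lam + h, c) - lagrangian(x, y, lam - h, c)) / (2 * h)
    return dx, dy, dlam

c = 2.0
# Candidate stationary point from the hand derivation: x = y = c/2, lam = c
stationary = gradient(c / 2, c / 2, c, c)
print(stationary)
```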

ReLU and Leaky ReLU

ReLU: f(x) = max(0, x); Leaky ReLU: f(x) = x if x > 0 else αx, with a small slope 0 < α < 1 (for example α = 0.01)
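
Both activations as scalar functions, as a minimal sketch. The default `alpha=0.01` is an assumption (a commonly used value), not something the card fixes.

```python
def relu(x):
    """ReLU: passes positive inputs through, clamps the rest to zero."""
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: like ReLU, but negative inputs keep a small slope alpha."""
    return x if x > 0 else alpha * x

for value in (-2.0, 0.0, 3.0):
    print(value, relu(value), leaky_relu(value))
```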

Write the multi-head attention formula: MultiHead(Q,K,V) = Concat(head_1,...,head_h)W^O

MultiHead(Q,K,V) = Concat(head_1,...,head_h)W^O, where head_i = Attention(QW_i^Q, KW_i^K, VW_i^V)
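
A pure-Python sketch of the formula on tiny matrices. It assumes each head is standard scaled dot-product attention, softmax(QK^T / sqrt(d_k))V, applied to its own projections of Q, K, V. The helper names and the toy weights are illustrative only.

```python
import math

def matmul(a, b):
    # Plain list-of-lists matrix product
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """Scaled dot-product attention (assumed form of each head)."""
    d_k = len(k[0])
    scores = matmul(q, [list(col) for col in zip(*k)])  # Q K^T
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v)

def multi_head(q, k, v, w_q, w_k, w_v, w_o):
    """Concat(head_1, ..., head_h) W^O with one projection triple per head."""
    heads = [attention(matmul(q, wq), matmul(k, wk), matmul(v, wv))
             for wq, wk, wv in zip(w_q, w_k, w_v)]
    # Concatenate the heads row by row, then apply the output projection
    concat = [sum((head[i] for head in heads), []) for i in range(len(q))]
    return matmul(concat, w_o)

x = [[1.0, 0.0], [0.0, 1.0]]            # two tokens, model width 2
proj = [[[1.0], [0.0]], [[0.0], [1.0]]]  # two heads, each projecting to width 1
w_o = [[1.0, 0.0], [0.0, 1.0]]           # identity output projection
out = multi_head(x, x, x, proj, proj, proj, w_o)
print(out)
```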

Normalization (machine learning)

L2 normalization equation: x_i' = x_i / ||x||_2
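
A minimal sketch of the equation for a single vector. Returning a zero vector unchanged is an assumption made here to avoid dividing by zero.

```python
import math

def l2_normalize(x):
    """Divide each component by the Euclidean norm ||x||_2."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm == 0:
        return list(x)  # assumption: leave the zero vector as-is
    return [v / norm for v in x]

unit = l2_normalize([3.0, 4.0])
print(unit)
```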

Batch normalization

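Batch normalization formula: Y = γ · (X - μ) / sqrt(σ² + ε) + β, where μ and σ² are the per-feature mean and variance over the mini-batch, ε is a small constant for numerical stability, and γ, β are the learned scale and shift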
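
A sketch of the training-time forward pass on a batch stored as a list of rows. It assumes per-feature statistics over the batch and adds a small `eps` inside the square root, the usual numerical-stability term. Running averages for inference are left out.

```python
import math

def batch_norm(batch, gamma, beta, eps=1e-5):
    """Normalize each feature column over the batch, then scale and shift."""
    n, d = len(batch), len(batch[0])
    mean = [sum(row[j] for row in batch) / n for j in range(d)]
    # Biased (divide by n) variance, as is usual for the forward pass
    var = [sum((row[j] - mean[j]) ** 2 for row in batch) / n for j in range(d)]
    return [[gamma[j] * (row[j] - mean[j]) / math.sqrt(var[j] + eps) + beta[j]
             for j in range(d)]
            for row in batch]

normalized = batch_norm([[1.0, 10.0], [3.0, 30.0]], gamma=[1.0, 1.0], beta=[0.0, 0.0])
print(normalized)
```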

Cosine similarity

Cosine similarity formula: cos(θ) = (A · B) / (||A|| ||B||)
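
A minimal sketch of the formula for two equal-length vectors. Raising an error for a zero vector is an assumption, since the angle is undefined in that case.

```python
import math

def cosine_similarity(a, b):
    """cos(theta) = (A . B) / (||A|| ||B||)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))   # orthogonal vectors
print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))   # same direction
```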
