Write the Bellman equation for reinforcement learning

Bellman optimality equation: V*(s) = max_a [R(s,a) + γ Σ_{s'} P(s'|s,a) V*(s')], where γ ∈ [0, 1) is the discount factor
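
One way to see the equation in action is value iteration, which applies the right-hand side repeatedly until V stops changing. The sketch below is only an illustration: the function name `value_iteration`, the array layout (P indexed as state, action, next state) and the two-state MDP are assumptions made up for this example.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Repeat the Bellman backup until the largest change falls below tol.

    P: array of shape (S, A, S) with P[s, a, s2] = P(s2 | s, a)
    R: array of shape (S, A) with R[s, a] = immediate reward
    """
    V = np.zeros(P.shape[0])
    for _ in range(max_iters):
        Q = R + gamma * (P @ V)      # (S, A): value of each action in each state
        V_new = Q.max(axis=1)        # max over actions
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V

# Hypothetical 2-state, 2-action MDP: action 0 stays put, action 1 switches state.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
])
R = np.array([
    [0.0, 1.0],   # state 0: staying pays 0, switching pays 1
    [2.0, 0.0],   # state 1: staying pays 2, switching pays 0
])
V = value_iteration(P, R)
print(V)
```

In this toy problem the best plan is to reach state 1 and stay there, so V(1) = 2 / (1 - 0.9) = 20 and V(0) = 1 + 0.9 · 20 = 19.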

Related concepts

Adam optimizer weight update with m and v terms

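Adam optimizer weight update: w_t = w_{t-1} - α * m̂_t / (sqrt(v̂_t) + ε), where m_t = β1 * m_{t-1} + (1 - β1) * g_t, v_t = β2 * v_{t-1} + (1 - β2) * g_t^2, and the bias-corrected terms are m̂_t = m_t / (1 - β1^t), v̂_t = v_t / (1 - β2^t)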
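
A minimal sketch of one Adam step, assuming the usual formulation in which the moment estimates m and v are bias-corrected before the update. The function name `adam_step` and the default hyperparameters (α = 1e-3, β1 = 0.9, β2 = 0.999, ε = 1e-8) are choices made for this example.

```python
import numpy as np

def adam_step(w, grad, m, v, t, alpha=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad          # first moment (running mean of gradients)
    v = beta2 * v + (1 - beta2) * grad ** 2     # second moment (running mean of squared gradients)
    m_hat = m / (1 - beta1 ** t)                # bias correction for the zero initialisation
    v_hat = v / (1 - beta2 ** t)
    w = w - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

w = np.array([1.0, -2.0])
m = np.zeros_like(w)
v = np.zeros_like(w)
grad = np.array([0.5, -4.0])                    # made-up gradient for illustration
w_new, m, v = adam_step(w, grad, m, v, t=1)
print(w_new)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so each weight moves by roughly α regardless of the gradient's magnitude.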

Self-attention

Attention(Q,K,V) = softmax(QK^T/√d_k)V
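
The formula can be sketched in a few lines of NumPy. This is an illustrative single-head version without masking or batching; the helper names `softmax` and `attention` and the random inputs are assumptions for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention for Q (n, d_k), K (m, d_k), V (m, d_v)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # (n, m) similarity of each query to each key
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V, weights          # (n, d_v) weighted average of the values

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(5, 4))
V = rng.normal(size=(5, 2))
out, weights = attention(Q, K, V)
print(out.shape, weights.sum(axis=1))
```

Dividing by √d_k keeps the scores from growing with the key dimension, which would otherwise push the softmax towards a near one-hot distribution.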

Stochastic gradient descent

Policy Gradient Theorem Equation

ReLU and Leaky ReLU

ReLU: f(x) = max(0, x); Leaky ReLU: f(x) = x if x > 0 else αx (0 < α < 1, typically a small value such as 0.01)
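
Both activations are one-liners in NumPy. The sketch below assumes α = 0.01 as the default slope, which is a common choice rather than a fixed part of the definition.

```python
import numpy as np

def relu(x):
    """Zero out negative inputs, pass positive inputs through."""
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    """Like ReLU, but negative inputs are scaled by alpha instead of zeroed."""
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))         # negatives become 0
print(leaky_relu(x))   # negatives are scaled by 0.01
```

The small negative slope means the gradient is never exactly zero for x < 0, which is the usual motivation for the leaky variant.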

Write the contrastive loss function for SimCLR

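Contrastive loss function (NT-Xent, normalized temperature-scaled cross-entropy): ℓ(i,j) = -log( exp(sim(z_i, z_j)/τ) / Σ_{k≠i} exp(sim(z_i, z_k)/τ) ), where sim(u, v) = u·v / (‖u‖ ‖v‖) is cosine similarity, τ is the temperature, and k runs over all 2N augmented views in the batch. Total loss: L = (1/2N) Σ_{k=1}^{N} [ℓ(2k-1, 2k) + ℓ(2k, 2k-1)]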
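
SimCLR's published objective is NT-Xent, a softmax cross-entropy over cosine similarities in which each view's positive is the other augmentation of the same example. The sketch below is a rough NumPy illustration of that loss, not a margin-based pairwise loss. It assumes rows 2k and 2k+1 of `z` (0-indexed) are the two views of example k; the function name `nt_xent_loss` and the temperature 0.5 are choices made for this example.

```python
import numpy as np

def nt_xent_loss(z, tau=0.5):
    """NT-Xent loss for embeddings z of shape (2N, d), views paired as (0,1), (2,3), ..."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # unit norm, so dot product = cosine
    sim = z @ z.T / tau                                # (2N, 2N) scaled similarities
    np.fill_diagonal(sim, -np.inf)                     # drop the k = i term from the denominator
    row_max = sim.max(axis=1, keepdims=True)           # stabilise the log-sum-exp
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    idx = np.arange(z.shape[0])
    pos = idx ^ 1                                      # partner index: 0<->1, 2<->3, ...
    return -log_prob[idx, pos].mean()                  # average over all 2N anchors

aligned = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
misaligned = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
print(nt_xent_loss(aligned), nt_xent_loss(misaligned))
```

The loss is low when each pair of views points in the same direction and high when a view is closer to another example's views than to its own partner.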

Gradient descent

Gradient descent weight update equation: w := w - α * ∇J(w)
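
As a quick illustration, the update rule can be run on a one-dimensional quadratic. The objective J(w) = (w - 3)^2, the step size and the function name `gradient_descent` are assumptions chosen for this sketch.

```python
def gradient_descent(grad, w0, alpha=0.1, steps=100):
    """Repeat w := w - alpha * grad(w) for a fixed number of steps."""
    w = w0
    for _ in range(steps):
        w = w - alpha * grad(w)
    return w

# J(w) = (w - 3)^2 has gradient 2 * (w - 3) and its minimum at w = 3.
w_star = gradient_descent(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(w_star)
```

With α = 0.1 the distance to the minimum shrinks by a factor of 0.8 per step; a step size above 1.0 would make this particular problem diverge.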
