LoRA uses r << d for efficient adaptation
LoRA (Low-Rank Adaptation) fine-tunes a pre-trained model by freezing the original weights and learning a low-rank update with rank r << d. Only the two small factor matrices are trained, so far fewer parameters are updated and training needs less compute and memory.
Example
For a 768 x 768 weight matrix, a rank-64 LoRA adapter trains 2 x 768 x 64 = 98,304 parameters instead of the full 589,824, about one sixth as many. A smaller rank such as r = 8 cuts this to 12,288.
This efficiency matters because fewer trainable parameters mean faster adaptation, lower memory use during training, and small adapter files that are cheap to store per task.
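The low-rank update can be sketched in a few lines of plain Python. This is an illustrative sketch, not a library implementation: the class name `LoRALinear`, the helper `matvec`, and the `alpha / r` scaling and zero initialization of `B` are assumptions taken from common LoRA practice rather than from the text above.

```python
import random


def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


class LoRALinear:
    """Hypothetical linear layer: y = W x + (alpha / r) * B (A x), with W frozen."""

    def __init__(self, W, r, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(W), len(W[0])
        self.W = W  # frozen pre-trained weights, never updated
        # A (r x d_in) starts small and random; B (d_out x r) starts at zero,
        # so the adapter contributes nothing before training begins.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(r)]
        self.B = [[0.0] * r for _ in range(d_out)]
        self.scale = alpha / r

    def forward(self, x):
        base = matvec(self.W, x)
        update = matvec(self.B, matvec(self.A, x))  # rank-r path
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


# Tiny usage example: a 4 x 4 identity "pre-trained" matrix with a rank-2 adapter.
W = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
layer = LoRALinear(W, r=2)
print(layer.forward([1.0, 2.0, 3.0, 4.0]))
print(layer.trainable_params(), "trainable vs", 4 * 4, "in the full matrix")
```

Because `B` starts at zero, the layer initially reproduces the frozen model exactly; only `A` and `B` would receive gradient updates during fine-tuning.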
LoRA rank r controls model capacity and parameters
UMAP is faster than t-SNE
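UMAP is typically faster than t-SNE because it builds its neighbor graph with approximate nearest-neighbor search and optimizes a cross-entropy objective with stochastic gradient descent.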
LoRA vs full fine-tuning
LoRA trains only rank-r adapters (often around 0.1% of the parameters, depending on r and which layers are adapted), while full fine-tuning updates every weight.
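To see where a figure of that order comes from, the adapter size can be counted directly. The configuration below is hypothetical (a 7-billion-parameter model with hidden size 4096, 32 layers, and two adapted square matrices per layer at rank 8); the function name `lora_param_count` is likewise only illustrative.

```python
def lora_param_count(d_model: int, r: int, n_layers: int, matrices_per_layer: int) -> int:
    """Trainable parameters when each adapted d_model x d_model matrix
    gets one (r x d_model) factor and one (d_model x r) factor."""
    per_matrix = 2 * d_model * r
    return n_layers * matrices_per_layer * per_matrix


# Hypothetical model configuration, chosen only for illustration.
total_params = 7_000_000_000
lora_total = lora_param_count(d_model=4096, r=8, n_layers=32, matrices_per_layer=2)
fraction = lora_total / total_params

print(f"LoRA trainable parameters: {lora_total:,}")
print(f"Share of the full model:   {fraction:.3%}")
```

Under these assumptions the adapters hold about 4.2 million parameters, roughly 0.06% of the model. Raising the rank or adapting more matrices moves the share up proportionally.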
t-SNE preserves local structure
t-SNE preserves local structure by converting pairwise distances into neighbor probabilities and minimizing the Kullback-Leibler divergence between the high-dimensional and low-dimensional distributions.
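The objective can be illustrated with a simplified sketch. It assumes a single global `sigma` and one joint normalization, whereas real t-SNE tunes a bandwidth per point from the perplexity setting and symmetrizes conditional probabilities; all function names here are hypothetical.

```python
import math


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def normalize(weights):
    """Scale a matrix of non-negative weights so all entries sum to 1."""
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]


def gaussian_affinities(points, sigma=1.0):
    """High-dimensional side: Gaussian kernel on pairwise distances."""
    n = len(points)
    return normalize([
        [0.0 if i == j else math.exp(-sq_dist(points[i], points[j]) / (2 * sigma ** 2))
         for j in range(n)]
        for i in range(n)
    ])


def student_t_affinities(points):
    """Low-dimensional side: heavy-tailed Student-t kernel (one degree of freedom)."""
    n = len(points)
    return normalize([
        [0.0 if i == j else 1.0 / (1.0 + sq_dist(points[i], points[j]))
         for j in range(n)]
        for i in range(n)
    ])


def kl_divergence(P, Q, eps=1e-12):
    """KL(P || Q) summed over all pairs; eps guards against log(0)."""
    return sum(
        p * math.log((p + eps) / (q + eps))
        for p_row, q_row in zip(P, Q)
        for p, q in zip(p_row, q_row)
        if p > 0
    )


# Two tight clusters in 3-D, and two candidate 2-D embeddings.
high = [(0, 0, 0), (0.1, 0, 0), (5, 5, 5), (5.1, 5, 5)]
good = [(0, 0), (0.1, 0), (5, 5), (5.1, 5)]   # neighbors stay together
bad = [(0, 0), (5, 5), (0.1, 0), (5.1, 5)]    # neighbors pulled apart

P = gaussian_affinities(high)
kl_good = kl_divergence(P, student_t_affinities(good))
kl_bad = kl_divergence(P, student_t_affinities(bad))
print(f"KL for neighbor-preserving embedding: {kl_good:.4f}")
print(f"KL for neighbor-breaking embedding:   {kl_bad:.4f}")
```

The embedding that keeps each point next to its original neighbor gets the lower divergence, which is the sense in which minimizing KL favors local structure.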
Overlapping subproblems
Dynamic programming solves overlapping subproblems by storing results of subproblems to avoid redundant calculations
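A minimal illustration, using Fibonacci numbers as the example problem (an assumption; the text above names no specific problem). The `calls` counter is added only to make the saved work visible.

```python
from functools import lru_cache

calls = {"naive": 0, "memo": 0}


def fib_naive(n: int) -> int:
    """Plain recursion: the same subproblems are recomputed many times."""
    calls["naive"] += 1
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)


@lru_cache(maxsize=None)
def fib_memo(n: int) -> int:
    """Same recursion, but each subproblem result is stored and reused."""
    calls["memo"] += 1
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)


print(fib_naive(20), "computed with", calls["naive"], "calls")
print(fib_memo(20), "computed with", calls["memo"], "calls")
```

With memoization each value from 0 to 20 is computed once, so the number of calls grows linearly in n rather than exponentially.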