
LoRA rank r controls model capacity and parameters

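In LoRA, the pretrained weight matrix W (d x k) stays frozen and only a low-rank update BA is trained, where B is d x r and A is r x k. The rank r therefore sets both the capacity of the adapter, that is, how many independent directions the update can span, and the number of trainable parameters, r(d + k) per adapted matrix. A higher rank gives the adapter more capacity and more trainable parameters; the size of the base model is unchanged.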

Example

For the same adapted layers, a LoRA adapter with rank r=32 has twice as many trainable parameters as one with rank r=16, and its update can span up to twice as many directions.
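
The comparison can be checked with a few lines of arithmetic. The following is a minimal sketch, not a reference implementation: the function name `lora_trainable_params` and the 4096 x 4096 layer size are assumptions chosen only for illustration, and the count covers a single adapted weight matrix with factors B (d x r) and A (r x k).

```python
def lora_trainable_params(d: int, k: int, r: int) -> int:
    """Trainable parameters LoRA adds to one d x k weight matrix at rank r."""
    # B has shape (d, r) and A has shape (r, k); the frozen weight W is not counted.
    return r * (d + k)


# Hypothetical layer size, chosen only for illustration.
d = k = 4096
full = d * k  # parameters a full fine-tune would update in this one matrix

for r in (16, 32):
    n = lora_trainable_params(d, k, r)
    print(f"r={r}: {n:,} trainable params ({n / full:.2%} of the full matrix)")
```

Because the count is linear in r, doubling the rank doubles the adapter size regardless of the layer dimensions.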

Choosing r is a trade-off between adapter capacity and the cost of training and storing the adapter, so it is worth tuning when optimizing the performance of LoRA models.

Related concepts

LoRA (machine learning)

LoRA uses r << d for efficient adaptation

Alex Lora Cercos

Alex Lora is a Spanish film director

LoRA vs full fine-tuning

LoRA trains only rank-r adapters (often around 0.1% of the parameters, depending on r and which layers are adapted); full fine-tuning updates every weight

MoE models have more parameters but similar compute cost

MoE models spread their parameters across many experts but route each token to only a few of them (top-k routing), so per-token compute stays close to that of a much smaller dense model

Neural scaling law

Chinchilla scaling law: compute-optimal model size and training tokens grow together, each roughly as the square root of the compute budget

Vocabulary size matters: larger vocab = shorter sequences but more parameters

A larger vocabulary shortens tokenized sequences but enlarges the embedding and output matrices, each of size vocab_size x d_model (shared when the weights are tied)
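
The parameter cost of a larger vocabulary can be estimated directly. This is a rough sketch under stated assumptions: `embedding_params`, the `tied` flag, and the example sizes are hypothetical names and values for illustration, and the count covers only the input embedding table and the output projection.

```python
def embedding_params(vocab_size: int, d_model: int, tied: bool = True) -> int:
    """Parameters in the token embedding table plus the output projection."""
    # The input embedding table is vocab_size x d_model.
    table = vocab_size * d_model
    # With tied weights the output projection reuses the same table;
    # otherwise it adds a second matrix of the same size.
    return table if tied else 2 * table


# Hypothetical sizes, chosen only for illustration.
d_model = 4096
for vocab_size in (32_000, 128_000):
    n = embedding_params(vocab_size, d_model, tied=False)
    print(f"vocab={vocab_size:,}: {n:,} embedding + output params")
```

The sequence-length side of the trade-off depends on the tokenizer and the text, so it is not modeled here.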
