GPU memory requirements rules of thumb
Long overdue, will update this page as I find better options.
Tutorials
- Transformer Math 101 | EleutherAI Blog AWESOME ← TODO
Links
- LLaMA 7B GPU Memory Requirement - 🤗Transformers - Hugging Face Forums
    - 7B full precision => $7 \times 4 = 28$ GB of GPU RAM
        - quantization: `torch_dtype=torch.float16` etc. to use half the memory; this is for inference, training requires a bit more (loading sketch after this list)
    - Why `*4`? Storing weights + gradients; better explanation at that link
        - depending on the optimizer, it might be `*8` etc. (worked estimator sketch after this list)
- HF: GPU (more about training, less about mental models)
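For the half-precision point above, a minimal loading sketch (the model id is just an example, any causal LM works the same way):

```python
import torch
from transformers import AutoModelForCausalLM

# fp16 weights take 2 bytes/param instead of fp32's 4,
# so a 7B model drops from ~28 GB to ~14 GB of weights.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # example model id
    torch_dtype=torch.float16,
)
```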
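And the `*4` / `*8` reasoning as code, a back-of-the-envelope sketch under the usual assumptions (plain fp32 Adam; activations and KV cache ignored), not a reimplementation of any linked calculator:

```python
def estimate_gpu_gb(params_billions: float, bytes_per_param: int = 4,
                    training: bool = False, optimizer_bytes: int = 8) -> float:
    """Rule-of-thumb GPU memory in (decimal) GB: billions of params * bytes/param.

    Inference: weights only (4 bytes/param in fp32, 2 in fp16, 1 in int8).
    Training adds gradients (same dtype as the weights) and optimizer state
    (Adam keeps two fp32 moments, ~8 extra bytes/param).
    """
    per_param = bytes_per_param
    if training:
        per_param += bytes_per_param   # gradients
        per_param += optimizer_bytes   # optimizer moments (Adam: m and v)
    return params_billions * per_param

print(estimate_gpu_gb(7))                      # fp32 inference: 28.0 GB
print(estimate_gpu_gb(7, bytes_per_param=2))   # fp16 inference: 14.0 GB
print(estimate_gpu_gb(7, training=True))       # fp32 Adam training: 112.0 GB
```

Transformer Math 101 goes deeper (mixed precision changes the per-parameter byte counts), but this is the shape of the mental model.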
Calculators
- Model Memory Utility - a Hugging Face Space by hf-accelerate
    - manually select 'transformers'
    - results for Llama-3.2-1B are 4.6 GB total size, 18 GB for training (18 ≈ 4 × 4.6, i.e. roughly 4x the model size for Adam training)
        - doesn't match the info above, but matches e.g. Hardware requirements for Llama 3.2 3B with full context 128k? : r/LocalLLaMA
- LLM Model VRAM Calculator - a Hugging Face Space by NyxKrage
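A big part of what the fuller calculators (like NyxKrage's) add on top of raw weight size is the KV cache, which scales with context length. A minimal sketch; the Llama-3.2-1B shape values below are assumptions from memory, check the model's config.json:

```python
def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                context_len: int, bytes_per_elem: int = 2, batch: int = 1) -> float:
    """K and V caches across all layers, in decimal GB (default 2 bytes/elem = fp16)."""
    elems = 2 * n_layers * n_kv_heads * head_dim * context_len * batch  # 2 = K and V
    return elems * bytes_per_elem / 1e9

# Assumed Llama-3.2-1B shape: 16 layers, 8 KV heads (GQA), head_dim 64.
print(kv_cache_gb(16, 8, 64, 128_000))  # ~4.2 GB at fp16 for a 128k context
```

Which is how a model whose weights fit in roughly 2.5 GB at fp16 can still need several more GB at full 128k context, in line with the r/LocalLLaMA thread above.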