Optimised quants for high-throughput deployments! Compatible with Transformers, TGI & vLLM 🤗
- hugging-quants/Meta-Llama-3.1-405B-Instruct-AWQ-INT4 (Text Generation, 38.9k downloads, 33 likes)
- hugging-quants/Meta-Llama-3.1-405B-Instruct-BNB-NF4 (Text Generation, 1.45k downloads, 5 likes)
- hugging-quants/Meta-Llama-3.1-405B-Instruct-GPTQ-INT4 (Text Generation, 1.61k downloads, 15 likes)
- hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4 (Text Generation, 84.6k downloads, 73 likes)
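As a sketch of how one of these checkpoints might be served with TGI (one of the compatible backends named above): the port, cache path, and hardware sizing below are illustrative assumptions, not part of the collection itself.

```shell
# Serve the 70B AWQ-INT4 checkpoint with Text Generation Inference (TGI).
# Assumes NVIDIA GPUs with enough free VRAM for the INT4 weights.
docker run --gpus all --shm-size 1g -p 8080:80 \
    -v "$HOME/.cache/huggingface:/data" \
    ghcr.io/huggingface/text-generation-inference:latest \
    --model-id hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4 \
    --quantize awq
```

Once the container is up, completions can be requested from the server's `/generate` endpoint on port 8080.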