Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients Paper • 2407.08296 • Published Jul 11 • 31
INT4 LLMs for vLLM Collection Accurate INT4 quantized models by Neural Magic, ready for use with vLLM! • 18 items • Updated 21 days ago • 5
FP8 LLMs for vLLM Collection Accurate FP8 quantized models by Neural Magic, ready for use with vLLM! • 37 items • Updated 24 days ago • 51
Research projects on top of vLLM Collection Papers cited in https://blog.vllm.ai/2024/07/25/lfai-perf.html • 6 items • Updated Jul 29 • 12
DPO datasets for DE Collection A collection of DPO datasets for the DE language. • 6 items • Updated Apr 15 • 1
When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method Paper • 2402.17193 • Published Feb 27 • 23
Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens Paper • 2401.17377 • Published Jan 30 • 34
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling Paper • 2311.00430 • Published Nov 1, 2023 • 56
ModuleFormer: Learning Modular Large Language Models From Uncurated Data Paper • 2306.04640 • Published Jun 7, 2023 • 7
Retentive Network: A Successor to Transformer for Large Language Models Paper • 2307.08621 • Published Jul 17, 2023 • 170
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis Paper • 2307.01952 • Published Jul 4, 2023 • 80
TART: A plug-and-play Transformer module for task-agnostic reasoning Paper • 2306.07536 • Published Jun 13, 2023 • 11