Welcome FalconMamba: The first strong attention-free 7B model
Exciting release!
Hi @Jenish-23,
For running AWQ models with HF transformers, please refer to this documentation section: https://huggingface.co./docs/transformers/quantization#awq
For AWQ + LoRA, you just need to load an AWQ base model with HF transformers and apply LoRA as usual, with no code changes. Make sure to install transformers from source for that.
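In HF terms this is just `from_pretrained` on the AWQ checkpoint followed by `peft`'s `get_peft_model`. As a library-free sketch of what the LoRA adapter adds on top of the frozen (quantized) base weight (toy sizes, illustrative values):

```python
import random

# LoRA replaces a frozen base projection W x with W x + (alpha / r) * B A x,
# where A is (r x d_in, small random init) and B is (d_out x r, zero init),
# so training starts exactly from the base model's behavior.
random.seed(0)
d_in, d_out, r, alpha = 4, 3, 2, 8

W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
B = [[0.0] * r for _ in range(d_out)]           # zero init: adapter starts as a no-op

def matvec(M, v):
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def lora_forward(x):
    base = matvec(W, x)                          # frozen (quantized) base path
    delta = matvec(B, matvec(A, x))              # trainable low-rank update path
    return [b + (alpha / r) * d for b, d in zip(base, delta)]

x = [1.0, 2.0, 3.0, 4.0]
out = lora_forward(x)
```

Since B is zero-initialized, the adapted forward pass matches the base model until training updates B.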
Hmm, interesting. Can you try generating some text with sampling methods?
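With transformers this would just be `model.generate(..., do_sample=True, top_p=..., temperature=...)`. As a library-free sketch of what temperature plus nucleus (top-p) sampling does to a logit vector (the logit values below are made up):

```python
import math
import random

def top_p_sample(logits, temperature=0.7, top_p=0.9, rng=random.Random(0)):
    """Sample a token index: temperature-scaled softmax, truncated to the
    smallest set of tokens whose cumulative probability reaches top_p."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]     # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]

    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept, cum = [], 0.0
    for i in order:                              # nucleus truncation
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break

    mass = sum(probs[i] for i in kept)
    weights = [probs[i] / mass for i in kept]    # renormalize inside the nucleus
    return rng.choices(kept, weights=weights, k=1)[0]

logits = [2.0, 1.0, 0.2, -1.0, -3.0]             # toy vocabulary of 5 tokens
token = top_p_sample(logits)
```

Lower temperature sharpens the distribution; a smaller top-p keeps fewer candidate tokens.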
Hi!
I think NEFTune should be supported out of the box, as you just need to pass the correct argument, neftune_noise_alpha, in TrainingArguments, right?
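For reference, here is a library-free sketch of what that knob controls: NEFTune adds uniform noise in [-1, 1] to the input embeddings, scaled by alpha / sqrt(L * d) for sequence length L and embedding dimension d (the toy shapes below are illustrative):

```python
import math
import random

def neftune_noise(embeddings, alpha, rng=random.Random(0)):
    """Add NEFTune-style noise to a [seq_len][dim] embedding matrix:
    uniform noise in [-1, 1] scaled by alpha / sqrt(seq_len * dim)."""
    seq_len, dim = len(embeddings), len(embeddings[0])
    scale = alpha / math.sqrt(seq_len * dim)
    return [[e + scale * rng.uniform(-1.0, 1.0) for e in row] for row in embeddings]

emb = [[0.0] * 16 for _ in range(8)]       # toy 8-token, 16-dim embeddings
noisy = neftune_noise(emb, alpha=5.0)      # alpha plays the role of neftune_noise_alpha
bound = 5.0 / math.sqrt(8 * 16)            # every noise entry stays within this bound
```

The longer the sequence and the wider the embeddings, the smaller the per-coordinate noise, which keeps the total perturbation magnitude roughly constant.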
Great work!
Very nice demo!!