Introducing Idefics2: A Powerful 8B Vision-Language Model for the community
(took me the entire weekend to think about this pun)
BLOOOM*
hey @RonanMcGovern, if I trust the reported numbers, Idefics2 performs better than LLaVA-Llama-3-8B-v1.1 on the majority of benchmarks.
Something I am very excited about with synthetic data is the increased ability to shape the data so that it has exactly the properties you want.
We typically spend a lot of time filtering web-scale data by building heuristics that detect "poor-quality" samples. With control over the data creation process, you can quickly tune the generation process to give some specific properties to the data. Often it's just about telling your model to do X, and not to do Y.
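To make the contrast concrete, here is a minimal sketch in Python. It is not how any Idefics2 dataset was actually built; the function names, the prompt layout, and the thresholds are all hypothetical. The first function is a toy post-hoc heuristic of the kind used to filter web-scale data, and the second states the wanted and unwanted properties directly in the generation prompt.

```python
def looks_poor_quality(sample: str, min_words: int = 5) -> bool:
    """Toy filtering heuristic (hypothetical thresholds): flag samples
    that are very short or highly repetitive."""
    words = sample.split()
    if not words:
        return True
    unique_ratio = len(set(words)) / len(words)
    return len(words) < min_words or unique_ratio < 0.5


def build_generation_prompt(task: str, do: list[str], dont: list[str]) -> str:
    """Assemble a prompt that tells the generator model what to do (X)
    and what not to do (Y). The layout is an assumption, not a standard."""
    lines = [f"Task: {task}", "Requirements:"]
    lines += [f"- Do: {rule}" for rule in do]
    lines += [f"- Do not: {rule}" for rule in dont]
    return "\n".join(lines)


prompt = build_generation_prompt(
    task="Write a question about the chart",
    do=["refer to a specific value shown in the chart"],
    dont=["ask yes/no questions"],
)
print(prompt)
```

With the filter, changing a property of the data means writing and validating a new heuristic. With the prompt, it means editing one line of the `do` or `dont` list and regenerating.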