davidberenstein1957's Collections
LLM evals and benchmark datasets
Updated Aug 17
allenai/reward-bench • Updated 10 days ago • 8.11k rows • 56.4k downloads • 67 likes
openai/openai_humaneval • Updated Jan 4 • 164 rows • 8.33k downloads • 226 likes
google/IFEval • Updated Aug 14 • 541 rows • 3.08k downloads • 29 likes
allenai/ai2_arc • Updated Dec 21, 2023 • 7.79k rows • 694k downloads • 128 likes
allenai/winogrande • Updated Jan 18 • 12.1k downloads • 53 likes
TIGER-Lab/MMLU-Pro • Updated 13 days ago • 12.1k rows • 154k downloads • 259 likes
cais/mmlu • Updated Mar 8 • 231k rows • 332k downloads • 300 likes
truthfulqa/truthful_qa • Updated Jan 4 • 1.63k rows • 6.23k downloads • 196 likes
openai/gsm8k • Updated Jan 4 • 17.6k rows • 44.9k downloads • 365 likes
Rowan/hellaswag • Updated Sep 28, 2023 • 60k rows • 14.8k downloads • 83 likes
tatsu-lab/alpaca_eval • Updated Aug 16 • 62.7k downloads • 48 likes
HuggingFaceH4/mt_bench_prompts • Updated Jul 3, 2023 • 80 rows • 671 downloads • 15 likes
nvidia/ChatRAG-Bench • Updated May 24 • 34.6k rows • 1.78k downloads • 94 likes
rungalileo/ragbench • Updated Jun 11 • 95.4k rows • 718 downloads • 8 likes