Collections

Collections including paper arxiv:2307.13854

a collection of algorithmic agents for user interfaces/interactions and program synthesis

about 7 hours ago

Reinforcement Learning on Web Interfaces Using Workflow-Guided Exploration

Paper • 1802.08802 • Published Feb 24, 2018

Mapping Natural Language Commands to Web Elements

Paper • 1808.09132 • Published Aug 28, 2018

Learning to Navigate the Web

Paper • 1812.09195 • Published Dec 21, 2018

Interactive Task and Concept Learning from Natural Language Instructions and GUI Demonstrations

Paper • 1909.00031 • Published Aug 30, 2019

Embodiment lets a model interact with its environment. The key is to anticipate what change its current action could bring about.

about 2 hours ago

Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation

Paper • 2408.11812 • Published 29 days ago • 4

WebArena: A Realistic Web Environment for Building Autonomous Agents

Paper • 2307.13854 • Published Jul 25, 2023 • 23

Agent Workflow Memory

Paper • 2409.07429 • Published 8 days ago • 25

Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale

Paper • 2409.08264 • Published 7 days ago • 39

🥇 BigCodeBench Leaderboard

Space • Running • 124

📢 UGI Leaderboard

Space • Running • 332

🏆🤖 Chatbot Arena Leaderboard

Space • Running • 3.53k

🥇 MTEB Leaderboard

Space • Running on CPU Upgrade • 3.74k

More Agents Is All You Need

Paper • 2402.05120 • Published Feb 3 • 51

OS-Copilot: Towards Generalist Computer Agents with Self-Improvement

Paper • 2402.07456 • Published Feb 12 • 40

Generative Agents: Interactive Simulacra of Human Behavior

Paper • 2304.03442 • Published Apr 7, 2023 • 11

Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models

Paper • 2310.04406 • Published Oct 6, 2023 • 8

about 9 hours ago

GAIA: a benchmark for General AI Assistants

Paper • 2311.12983 • Published Nov 21, 2023 • 182

MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

Paper • 2311.16502 • Published Nov 27, 2023 • 35

BLINK: Multimodal Large Language Models Can See but Not Perceive

Paper • 2404.12390 • Published Apr 18 • 24

RULER: What's the Real Context Size of Your Long-Context Language Models?

Paper • 2404.06654 • Published Apr 9 • 33
