Aaditya Ura

aaditya

https://aadityaura.github.io

AI & ML interests

ML Researcher | Creator of OpenBioLLM, Open Medical-LLM Leaderboard, MedMCQA and Med-Halt | Focus on Representation Learning on Graphs and Manifolds | NLP x Healthcare

Articles

Advancing Open-source Large Language Models in the Medical & Healthcare Domain

May 10 • 5

The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare

Apr 19 • 99

aaditya's activity

upvoted 56 papers 19 days ago

xGen-MM (BLIP-3): A Family of Open Large Multimodal Models

Paper • 2408.08872 • Published Aug 16 • 96

TraDiffusion: Trajectory-Based Training-Free Image Generation

Paper • 2408.09739 • Published Aug 19 • 7

Authorship Attribution in the Era of LLMs: Problems, Methodologies, and Challenges

Paper • 2408.08946 • Published Aug 16 • 9

Factorized-Dreamer: Training A High-Quality Video Generator with Limited and Low-Quality Data

Paper • 2408.10119 • Published Aug 19 • 15

Predicting Rewards Alongside Tokens: Non-disruptive Parameter Insertion for Efficient Inference Intervention in Large Language Model

Paper • 2408.10764 • Published about 1 month ago • 7

MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding

Paper • 2408.11049 • Published about 1 month ago • 10

NeCo: Improving DINOv2's spatial representations in 19 GPU hours with Patch Neighbor Consistency

Paper • 2408.11054 • Published about 1 month ago • 10

MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning

Paper • 2408.11001 • Published about 1 month ago • 11

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published about 1 month ago • 54

TableBench: A Comprehensive and Complex Benchmark for Table Question Answering

Paper • 2408.09174 • Published Aug 17 • 51

Out-of-Distribution Detection with Attention Head Masking for Multimodal Document Classification

Paper • 2408.11237 • Published about 1 month ago • 4

Backward-Compatible Aligned Representations via an Orthogonal Transformation Layer

Paper • 2408.08793 • Published Aug 16 • 4

FRAP: Faithful and Realistic Text-to-Image Generation with Adaptive Prompt Weighting

Paper • 2408.11706 • Published 30 days ago • 5

TrackGo: A Flexible and Efficient Method for Controllable Video Generation

Paper • 2408.11475 • Published 30 days ago • 16

GRAB: A Challenging GRaph Analysis Benchmark for Large Multimodal Models

Paper • 2408.11817 • Published 29 days ago • 7

FocusLLM: Scaling LLM's Context by Parallel Decoding

Paper • 2408.11745 • Published 30 days ago • 23

LLM Pruning and Distillation in Practice: The Minitron Approach

Paper • 2408.11796 • Published 30 days ago • 53

TWLV-I: Analysis and Insights from Holistic Evaluation on Video Foundation Models

Paper • 2408.11318 • Published about 1 month ago • 54

Vintern-1B: An Efficient Multimodal Large Language Model for Vietnamese

Paper • 2408.12480 • Published 29 days ago • 13

The Russian-focused embedders' exploration: ruMTEB benchmark and Russian embedding model design

Paper • 2408.12503 • Published 29 days ago • 20

Jamba-1.5: Hybrid Transformer-Mamba Models at Scale

Paper • 2408.12570 • Published 29 days ago • 29

Hermes 3 Technical Report

Paper • 2408.11857 • Published Aug 15 • 34

Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

Paper • 2408.11878 • Published about 1 month ago • 48

HiRED: Attention-Guided Token Dropping for Efficient Inference of High-Resolution Vision-Language Models in Resource-Constrained Environments

Paper • 2408.10945 • Published about 1 month ago • 6

Memory-Efficient LLM Training with Online Subspace Descent

Paper • 2408.12857 • Published 28 days ago • 10

Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time

Paper • 2408.13233 • Published 28 days ago • 20

MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?

Paper • 2408.13257 • Published 27 days ago • 25

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published 29 days ago • 109

NanoFlow: Towards Optimal Large Language Model Serving Throughput

Paper • 2408.12757 • Published 28 days ago • 15

TVG: A Training-free Transition Video Generation Method with Diffusion Models

Paper • 2408.13413 • Published 27 days ago • 13

Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler

Paper • 2408.13359 • Published 27 days ago • 21

LLaVaOLMoBitnet1B: Ternary LLM goes Multimodal!

Paper • 2408.13402 • Published 27 days ago • 17

Training-free Long Video Generation with Chain of Diffusion Model Experts

Paper • 2408.13423 • Published 27 days ago • 19

K-Sort Arena: Efficient and Reliable Benchmarking for Generative Models via K-wise Human Preferences

Paper • 2408.14468 • Published 24 days ago • 33

LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs

Paper • 2408.13467 • Published 27 days ago • 23

Foundation Models for Music: A Survey

Paper • 2408.14340 • Published 25 days ago • 38

SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher

Paper • 2408.14176 • Published 25 days ago • 58

DSTI at LLMs4OL 2024 Task A: Intrinsic versus extrinsic knowledge for type classification

Paper • 2408.14236 • Published 25 days ago • 3

Text2SQL is Not Enough: Unifying AI and Databases with TAG

Paper • 2408.14717 • Published 24 days ago • 23

Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation

Paper • 2408.15239 • Published 23 days ago • 27

The Mamba in the Llama: Distilling and Accelerating Hybrid Models

Paper • 2408.15237 • Published 23 days ago • 36

Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published 24 days ago • 119

Auxiliary-Loss-Free Load Balancing Strategy for Mixture-of-Experts

Paper • 2408.15664 • Published 23 days ago • 11

Knowledge Navigator: LLM-guided Browsing Framework for Exploratory Search in Scientific Literature

Paper • 2408.15836 • Published 23 days ago • 11

Efficient LLM Scheduling by Learning to Rank

Paper • 2408.15792 • Published 23 days ago • 17

Leveraging Open Knowledge for Advancing Task Expertise in Large Language Models

Paper • 2408.15915 • Published 23 days ago • 19

LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation

Paper • 2408.15881 • Published 23 days ago • 20

Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models

Paper • 2408.15518 • Published 23 days ago • 41

BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Competitive Large Language Model Baseline

Paper • 2408.15079 • Published 24 days ago • 51

Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders

Paper • 2408.15998 • Published 22 days ago • 81

StyleRemix: Interpretable Authorship Obfuscation via Distillation and Perturbation of Style Elements

Paper • 2408.15666 • Published 23 days ago • 9

Law of Vision Representation in MLLMs

Paper • 2408.16357 • Published 22 days ago • 92

upvoted a paper 2 months ago

RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs

Paper • 2407.02485 • Published Jul 2 • 5

upvoted a collection 3 months ago

Life Science, Health and Medical Datasets for ML

Collection

A collection of datasets for Medical Domain • 4 items • Updated Jun 24 • 1

upvoted a paper 3 months ago

Instruction Pre-Training: Language Models are Supervised Multitask Learners

Paper • 2406.14491 • Published Jun 20 • 85

upvoted a paper 5 months ago

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Paper • 2404.14219 • Published Apr 22 • 250