TL;DR: Training LLMs to reason with just 1K training samples & a simple technique to control reasoning duration called "budget forcing".
I research LLMs as a PhD student at Stanford 😊 Previously at Peking University, Hugging Face, and Meta.
TL;DR: How to scale LLMs when data is scarce & predict performance ("scaling laws"). First to train LLMs across 1000s of AMD GPUs.
TL;DR: The standard for evaluating image/audio/text embeddings. Used by OpenAI, Google, Meta with 6M+ total installs.
TL;DR: State-of-the-art fully open sparse language models. Many training ablations & analysis on routing behavior.
TL;DR: Explores how LLMs generalize across languages & released state-of-the-art open LLMs at the time.
TL;DR: The first LLM that yields state-of-the-art performance on both generative & embedding tasks. Can speed up RAG by >60%.
TL;DR: Strong code LLMs trained on the largest dataset of Git commits (OctoCoder & CommitPack). Also built HumanEvalPack for evaluation.
TL;DR: Built Vision-Language Models that finished 2nd/3300+ in Meta's $100K Hateful Memes Competition.
Questions on papers I’ve co-authored: Please open a GitHub issue on the paper's code repository :)
Starting AI Research: If you want to get started in research, I recommend contributing to MTEB. We’re a community building the go-to place for everything embeddings, with 200K monthly active users on our leaderboard & regular publications you can co-author! Example papers from our community: MMTEB, MIEB, HUME, SEB.
My email is [email protected] :)
Health: I'm pretty into health optimization; my fav sports are swimming/beachvb/tennis :)
Languages: I've worked in Japanese/Chinese/English/German/French; even went all the way to pass the JLPT N1 😅
Arts: As a kid I worked as a voice-over artist for 8 years, dubbing German voices for Peter Pan (Disney), Pokémon, Game of Thrones (HBO), Dracula (NBC) & others (sample: Gortimer here & Victor here) 😁