In April, we published a research paper on a new approach for building better and faster LLMs using multi-token prediction. With this approach, we can train language models to predict multiple future words at once, improving model capabilities and training efficiency while allowing for faster inference. In the spirit of responsible open science, we’ve released pre-trained models for code completion that use this approach, to enable further exploration in the research community.
Get the model on Hugging Face ➡️ https://go.fb.me/dm1giu
More on this approach ➡️ https://go.fb.me/x1zhdq
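To make the idea concrete for anyone who wants to experiment before reading the paper, here is a minimal sketch of what multi-token prediction can look like: a shared trunk with one output head per future position, trained with a summed cross-entropy loss so that head k predicts the token k+1 steps ahead. All class names and hyperparameters below are illustrative, and the GRU trunk is just a stand-in for a real transformer; this is not the released model's code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiTokenLM(nn.Module):
    """Toy multi-token prediction model: a shared trunk feeds n_future
    independent output heads, head k predicting the token k+1 steps ahead."""

    def __init__(self, vocab_size=1000, d_model=128, n_future=4):
        super().__init__()
        self.n_future = n_future
        self.embed = nn.Embedding(vocab_size, d_model)
        # Shared trunk; a GRU stands in for the transformer used in practice.
        self.trunk = nn.GRU(d_model, d_model, batch_first=True)
        # One unembedding head per future position.
        self.heads = nn.ModuleList(
            nn.Linear(d_model, vocab_size) for _ in range(n_future)
        )

    def forward(self, tokens):
        h, _ = self.trunk(self.embed(tokens))        # (B, T, d_model)
        return [head(h) for head in self.heads]      # n_future tensors of (B, T, vocab)


def multi_token_loss(model, tokens):
    """Sum of cross-entropy losses, one per head, over a batch of token ids."""
    inputs = tokens[:, : -model.n_future]            # leave room for the future targets
    logits = model(inputs)
    loss = 0.0
    for k, head_logits in enumerate(logits):
        # Head k is trained to predict the token (k+1) positions ahead of each input.
        targets = tokens[:, k + 1 : tokens.size(1) - model.n_future + k + 1]
        loss = loss + F.cross_entropy(
            head_logits.reshape(-1, head_logits.size(-1)), targets.reshape(-1)
        )
    return loss


# Tiny smoke test on random token ids.
model = MultiTokenLM()
batch = torch.randint(0, 1000, (8, 64))              # (batch, sequence length)
multi_token_loss(model, batch).backward()
```

At inference time the extra heads can either be dropped, falling back to standard next-token prediction, or used to draft several tokens per step, which is where the faster inference mentioned above comes from.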
Multi-token prediction shows promise for improving efficiency and performance in language models, but the added complexity and resource demands, along with the challenge of ensuring consistent performance across varied datasets, may hinder widespread adoption.
You are not the first to use multi-token prediction; I started before April. I also use contextual tokens. See https://mltblog.com/4aHYM4i
Wow, are we witnessing another "Attention is all you need" moment?
An interesting approach. Keen to play around with it!
(Disclaimer: I haven't read the paper yet.) Probably a provocative question: any thoughts on why the paper was published in April but the model is only being released now?
It will be interesting to compare and contrast how this stands up against contextual tokens
Excellent work! Exciting news!
This is unique. Collaboration with a global AI community is more important than ever 🙏
I'm curious whether their multi-token model outperforms not only their own baseline but also the top models of a similar size. It works well for generative tasks, but the paper reports mixed results on multiple-choice question benchmarks. Also see https://arxiv.org/abs/2401.10774