New video! We're discussing some of the changes to the Meta Llama 3 Tokenizer with Aston Zhang, author of Dive into Deep Learning and a researcher on the Llama team. This conversation covers the change from SentencePiece to Tiktoken and what this enables for our latest models. Watch the full video on YouTube ➡️ https://lnkd.in/geN8XWf3
Thank you for answering major questions on applying error propagation, correlated variable scenarios, and the comparison of relative vs absolute.
Thanks Aston Zhang!! Yes, the community is growing.
Great talk… the English language isn't just a juxtaposition of words but also of phrases (ask speed readers), so intuitively it makes sense to think of phrases as tokens.
Very helpful!
Important question imo is not how it compares to its legacy model but other SOTA models. Do you have data on this as well?
Very informative
👍🏻
Interesting!
Llama 2 tokenizer vocabulary size: 32,000
Llama 3 tokenizer vocabulary size: 128,256

The 4x larger vocabulary means fewer tokens are needed to encode a given text with the Llama 3 tokenizer than with the Llama 2 tokenizer. For example, the following text is tokenized into 13 tokens by the Llama 3 tokenizer vs 18 tokens by the Llama 2 tokenizer.

Input: "Experience the state-of-the-art performance of Llama 3."
Llama 3: ['Experience', 'Ġthe', 'Ġstate', '-of', '-the', '-art', 'Ġperformance', 'Ġof', 'ĠL', 'lama', 'Ġ', '3', '.']
Llama 2: ['▁Exper', 'ience', '▁the', '▁state', '-', 'of', '-', 'the', '-', 'art', '▁performance', '▁of', '▁L', 'l', 'ama', '▁', '3', '.']
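If you want to reproduce this comparison yourself, here is a minimal sketch using the Hugging Face transformers AutoTokenizer; the model IDs below are my own illustrative assumptions (gated meta-llama repos), not something specified in the post.

# Minimal sketch of the token-count comparison above.
# Assumes the transformers library is installed and you have access to the
# gated meta-llama repos on the Hugging Face Hub; model IDs are assumptions.
from transformers import AutoTokenizer

text = "Experience the state-of-the-art performance of Llama 3."

llama2_tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
llama3_tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")

# Vocabulary sizes (including special tokens).
print(len(llama2_tok), len(llama3_tok))

# tokenize() returns the surface token strings without special tokens,
# so the lengths should match the counts quoted above.
print(len(llama2_tok.tokenize(text)), llama2_tok.tokenize(text))  # 18 tokens
print(len(llama3_tok.tokenize(text)), llama3_tok.tokenize(text))  # 13 tokens

Note that the raw token strings mark leading spaces differently: 'Ġ' for the byte-level BPE used by the Tiktoken-style Llama 3 tokenizer vs '▁' for SentencePiece in Llama 2, which is why the two lists look different even where the splits agree.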