Hugging Face

Software Development

The AI community building the future.

About us

Website: https://huggingface.co
Industry: Software Development
Company size: 51-200 employees
Type: Privately Held
Founded: 2016
Specialties: machine learning, natural language processing, and deep learning

Updates

  • Hugging Face reposted this

    Ahsen Khaliq

    ML @ Hugging Face

    Google presents PaliGemma: A versatile 3B VLM for transfer. Paper page: https://lnkd.in/eQevVDJh PaliGemma is an open Vision-Language Model (VLM) that is based on the SigLIP-So400m vision encoder and the Gemma-2B language model. It is trained to be a versatile and broadly knowledgeable base model that is effective to transfer. It achieves strong performance on a wide variety of open-world tasks. We evaluate PaliGemma on almost 40 diverse tasks including standard VLM benchmarks, but also more specialized tasks such as remote-sensing and segmentation.

  • Hugging Face reposted this

    Ahsen Khaliq

    ML @ Hugging Face

    Tencent presents MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions. Paper page: https://lnkd.in/eSSQ5XzQ Sora's high-motion intensity and long consistent videos have significantly impacted the field of video generation, attracting unprecedented attention. However, existing publicly available datasets are inadequate for generating Sora-like videos, as they mainly contain short videos with low motion intensity and brief captions. To address these issues, we propose MiraData, a high-quality video dataset that surpasses previous ones in video duration, caption detail, motion strength, and visual quality. We curate MiraData from diverse, manually selected sources and meticulously process the data to obtain semantically consistent clips. GPT-4V is employed to annotate structured captions, providing detailed descriptions from four different perspectives along with a summarized dense caption. To better assess temporal consistency and motion intensity in video generation, we introduce MiraBench, which enhances existing benchmarks by adding 3D consistency and tracking-based motion strength metrics. MiraBench includes 150 evaluation prompts and 17 metrics covering temporal consistency, motion strength, 3D consistency, visual quality, text-video alignment, and distribution similarity. To demonstrate the utility and effectiveness of MiraData, we conduct experiments using our DiT-based video generation model, MiraDiT. The experimental results on MiraBench demonstrate the superiority of MiraData, especially in motion strength.

  • Hugging Face reposted this

    Merve Noyan

    open-sourceress at 🤗 | Google Developer Expert in Machine Learning, MSc Candidate in Data Science

    The bleeding-edge alignment technique DPO for vision language models is now available in Hugging Face TRL, along with LoRA/QLoRA ⚡️ Links and more in comments 🔖 DPO is a popular cutting-edge alignment technique for language models. TL;DR: a (preference) model is trained using a dataset of inputs with chosen and rejected outputs, and this model generates scores for each input. The main model is fine-tuned using the scores. DPO for vision language models is essentially the same: since vision language models take in images projected into the text embedding space, it is just input tokens and output tokens. Quentin Gallouédec implemented support for Idefics2, Llava 1.5, and PaliGemma in TRL. 👏 As of now, VLM processors are quite non-standard, but the only differences are in the processors and chat templates themselves, so you can implement it very easily (see his PR in links). Thanks to TRL's support for PEFT and bitsandbytes, you can also try QLoRA and LoRA fine-tuning (which comes in the blog post) 😏 Please try the scripts, share your models, and let us know how it goes!
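
A minimal sketch of the DPO loss for a single preference pair, assuming the log-probabilities of each response have already been summed over its tokens. In the standard DPO formulation, the preference signal comes from the log-probability ratio between the policy and a frozen reference model rather than from a separately trained scorer. The function name `dpo_loss`, its arguments, and the default `beta=0.1` are illustrative assumptions, not TRL's API.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair, given summed token log-probabilities."""
    # Implicit rewards: how much more likely each response is under the
    # policy than under the frozen reference model.
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_reward - rejected_reward)
    # -log(sigmoid(margin)) rewritten as log(1 + exp(-margin)), computed stably.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy matches the reference on both responses, the margin is zero and the loss equals log 2; the loss drops below that as the policy shifts probability toward the chosen response. For a vision language model the inputs would be computed the same way, since the image only changes what the log-probabilities are conditioned on.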

  • Hugging Face reposted this

    Gradio

    22,434 followers

    🌟Introducing Transcription Delight!🌟 Effortlessly generate transcripts from any YouTube video (or any uploaded video/audio)! This app is super cool 😎 and incredibly handy! 🛠 It also refines your transcript with an LLM, transforming it into a polished markdown output for your downstream needs 😍 🤔 For example, you can use the markdown transcript as input to Claude/GPT-4, then ask questions about it or summarize it with ease! 🚀📝 🔥 Transcription Delight is an app created by Abubakar Abid 🙌 Dive into this Gradio app and up your game! Try it on Hugging Face Spaces: https://lnkd.in/gHFyHFbr OR build the app locally in three lines of code: 👉 git clone https://huggingface.co/spaces/abidlabs/transcription-delight 👉 cd transcription-delight 👉 python app.py

  • Hugging Face reposted this

    Florent Gbelidji

    Practical ML @ Hugging Face 🤗

    Case Study on LLM Applications: Renovating Public School Infrastructure in France. 🏦 Banque des Territoires, a French public financial institution, has launched EduRénov, a program aiming to support and finance the energy renovation of 10,000 public school buildings with 2 billion euros in loans. The goal is to achieve 40% energy savings in 5 years as part of France's national strategy for energy transformation. ❓ A typical question from a local community official might be, "In 2025, our city would like to initiate the renovation of its school. What type of loans can we get to fund the project?". Now, generative AI can provide accurate answers to such queries. 🤝 Indeed, Hugging Face collaborates with Banque des Territoires and Polyconseil on an AI solution that facilitates communication between local officials and Banque des Territoires representatives. This solution aims to increase the number of initiated renovation projects. 📖 Read our blog post to learn more about how open-source LLMs, RAG, and sovereign cloud solutions are used in this initiative: https://lnkd.in/e-9FPPqq 👨🏾‍🏫 I am proud to contribute to this project through Hugging Face's Expert Support Program (https://lnkd.in/eNgDgTeQ) and share expertise in ML solutions development with Polyconseil and Banque des Territoires. 🎉 Kudos to Henri Jouhaud, Anthony Truchet, Jeremy Cailton, Emma de Corbière, Johnny CHEN, Sébastien Ehling, Amaury Dreux, Nicolas T., Hakim Lahlou, Baptiste Bontoux, Marie Lattelais, Jean-Christophe Bernigaud, David Buchner and 🤗 Adrien Dufayard 🤗!

    Banque des Territoires (CDC Group) x Polyconseil x Hugging Face: Enhancing a Major French Environmental Program with a Sovereign Data Solution

    huggingface.co

  • Hugging Face reposted this

    Ahsen Khaliq

    ML @ Hugging Face

    UltraEdit: Instruction-based Fine-Grained Image Editing at Scale. Paper page: https://lnkd.in/e762P2Me This paper presents UltraEdit, a large-scale (approximately 4 million editing samples), automatically generated dataset for instruction-based image editing. Our key idea is to address the drawbacks in existing image editing datasets like InstructPix2Pix and MagicBrush, and provide a systematic approach to producing massive and high-quality image editing samples. UltraEdit offers several distinct advantages: 1) It features a broader range of editing instructions by leveraging the creativity of large language models (LLMs) alongside in-context editing examples from human raters; 2) Its data sources are based on real images, including photographs and artworks, which provide greater diversity and reduced bias compared to datasets solely generated by text-to-image models; 3) It also supports region-based editing, enhanced by high-quality, automatically produced region annotations. Our experiments show that canonical diffusion-based editing baselines trained on UltraEdit set new records on MagicBrush and Emu-Edit benchmarks. Our analysis further confirms the crucial role of real image anchors and region-based editing data.

  • Hugging Face reposted this

    Gradio

    22,434 followers

    🔥Introducing VILA: the VLM for video and interleaved image understanding, with an Apache-2.0 license and edge-deployment capabilities! Is VILA the best-in-class small VLM for edge deployment? Keep reading.. 👀 VILA uses interleaved image-text pretraining, which unlocks cool capabilities: - Video and multi-image understanding 🎬 - In-context learning 🧠 - Visual chain-of-thought 💡 Experience the future: edge deployment 📱 of VLMs! Try the demo now! 👇 🔥 How to try out VILA? There are multiple ways: ✅ The easiest is to launch the VILA Gradio app locally on your machine: https://lnkd.in/g475Hc5J 🚀 OR play with the app directly here: https://lnkd.in/gkA2vUJ3 👍 You can also launch inference locally for the 3B video model or for the 8B/40B image models: https://lnkd.in/gpW-QTqK 🤩 OR deploy the AWQ 4-bit quantized version of VILA on the edge with TinyChat: https://lnkd.in/gdfb_ZTD 💎 Useful resources: - VILA code: https://lnkd.in/gabJmpjv - Deployed Gradio VILA app: https://lnkd.in/gkA2vUJ3 - Model collection on @huggingface: https://lnkd.in/gcdT9dMA Demo coming soon on 🤗 Spaces. Stay tuned!

  • Hugging Face reposted this

    Alivia

    180 followers

    [#partenariat] As we recently announced, we are collaborating with Banque des Territoires (Groupe Caisse des Dépôts) and Hugging Face on the deployment of a custom, sovereign solution. 🎯 The goal? To facilitate the support that Banque des Territoires agents provide to local authorities as part of the EduRénov project. 🚀 Covering the challenges of data sovereignty, optimization, and security, this article walks through the steps that led to the deployment of the first version of the system. 👉 The article is available here: https://lnkd.in/e5Mjy-VD, and can also be viewed and downloaded directly below. Do you have a digital transformation project? Generative AI can help you reach your goals! Contact us ➡ [email protected]

  • Hugging Face reposted this

    Gradio

    22,434 followers

    🚀Create animated portraits (like the attached video clip) using the official LivePortrait Gradio app on Hugging Face Spaces! MIT License. Keep reading for more info, examples, and the demo link 👇 Produce lifelike videos from a single source image and input motion 🤯 The model is trained on 69M high-quality frames and thus produces high-quality outputs. Fast generation on ZeroGPU (A100s) on Spaces for free! Demo link: https://lnkd.in/gXSd5VkB

  • Hugging Face reposted this

    Philipp Schmid

    Technical Lead & LLMs at Hugging Face 🤗 | AWS ML HERO 🦸🏻‍♂️

    Build, Train, and Deploy AI Models with Google TPUs on Hugging Face! We're excited to announce the General Availability of Google TPUs on Hugging Face. Hugging Face users can now use the power of Google Cloud TPUs in both Spaces and Inference Endpoints to build, train, and deploy their Generative AI models. 🚀 TL;DR: 🚀 Google Cloud TPUs are available on Spaces and Inference Endpoints. 💡 3 new options from 16GB to 128GB TPU memory (1x1, 2x2, 2x4 v5e TPU) in us-west1. 🛠 Use TPUs in Spaces for ML demos, or in dev mode to train easily. 📈 Deploy LLMs on Inference Endpoints, starting with Meta Llama 3 and Google DeepMind Gemma, with Mistral and others to follow. 🔄 New Text Generation Inference backend now supports Google TPUs. 🌟 Starting at just $1.38/hour. Blog: https://lnkd.in/e_au-mqt Spaces: https://lnkd.in/eCun-cb9 Inference Endpoints: https://lnkd.in/eqks3UKd Big kudos to Alvaro Moran, Morgan Funtowicz, Simon Pagezy, Thibault Goehringer, Michelle Habonneau, Christophe Rannou, and the whole HF team for bringing Google TPUs to every Hugging Face user!


Funding

Hugging Face: 7 total rounds

Last round: Series D