Hugging Face

From Wikipedia, the free encyclopedia

Hugging Face, Inc.
Company type: Private
Industry: Artificial intelligence, machine learning, software development
Founded: 2016
Headquarters: New York City, United States
Area served: Worldwide
Key people:
  • Clément Delangue (CEO)
  • Julien Chaumond (CTO)
  • Thomas Wolf (CSO)
Products: Models, datasets, spaces
Revenue: US$15 million (2022)
Number of employees: 170 (2023)
Website: huggingface.co

Hugging Face, Inc. is a French-American company, incorporated under the Delaware General Corporation Law[1] and based in New York City, that develops computational tools for building applications using machine learning. It is most notable for its Transformers library, built for natural language processing applications, and for its platform that allows users to share machine learning models and datasets and showcase their work.

History

The company was founded in 2016 by French entrepreneurs Clément Delangue, Julien Chaumond, and Thomas Wolf in New York City, originally as a company that developed a chatbot app targeted at teenagers.[2] The company was named after the "hugging face" emoji.[2] After open sourcing the model behind the chatbot, the company pivoted to focus on being a platform for machine learning.

In March 2021, Hugging Face raised US$40 million in a Series B funding round.[3]

On April 28, 2021, the company launched the BigScience Research Workshop in collaboration with several other research groups to release an open large language model.[4] In 2022, the workshop concluded with the announcement of BLOOM, a multilingual large language model with 176 billion parameters.[5][6]

In December 2021, the company acquired Gradio, an open-source library for building machine learning applications in Python.[7]

On May 5, 2022, the company announced its Series C funding round led by Coatue and Sequoia.[8] The company received a $2 billion valuation.

On August 3, 2022, the company announced the Private Hub, an enterprise version of its public Hugging Face Hub that supports SaaS or on-premises deployment.[9]

In February 2023, the company announced a partnership with Amazon Web Services (AWS) that would make Hugging Face's products available to AWS customers as building blocks for their custom applications. The company also said the next generation of BLOOM would run on Trainium, a proprietary machine learning chip created by AWS.[10][11][12]

In August 2023, the company announced that it had raised $235 million in a Series D funding round at a $4.5 billion valuation. The funding was led by Salesforce, with notable participation from Google, Amazon, Nvidia, AMD, Intel, IBM, and Qualcomm.[13]

In June 2024, the company announced, together with Meta and Scaleway, the launch of a new AI accelerator program for European startups. The initiative aims to help startups integrate open foundation models into their products, accelerating the EU AI ecosystem. The program, based at STATION F in Paris, will run from September 2024 to February 2025. Selected startups will receive mentoring, access to AI models and tools, and Scaleway's computing power.[14]

Services and technologies

Transformers Library

The Transformers library is a Python package that contains open-source implementations of transformer models for text, image, and audio tasks. It is compatible with the PyTorch, TensorFlow, and JAX deep learning libraries and includes implementations of notable models like BERT and GPT-2.[15] The library was originally called "pytorch-pretrained-bert",[16] which was then renamed to "pytorch-transformers" and finally to "transformers".
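A minimal usage sketch, assuming an installed transformers package and the library's high-level pipeline API; with no model specified, a default checkpoint for the task is downloaded from the Hugging Face Hub (the exact default may change between library versions):

    from transformers import pipeline

    # Build a sentiment-analysis pipeline; without an explicit model,
    # a default pre-trained checkpoint is fetched from the Hub.
    classifier = pipeline("sentiment-analysis")

    # Run inference on a sample sentence.
    print(classifier("Hugging Face's Transformers library is easy to use."))
    # e.g. [{'label': 'POSITIVE', 'score': 0.9998}]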

A JavaScript version, Transformers.js,[17] has also been developed, allowing models to run directly in the browser.

Hugging Face Hub

The Hugging Face Hub is a platform (centralized web service) for hosting:[18]

  • Git-based code repositories, including discussions and pull requests for projects;
  • models, also with Git-based version control;
  • datasets, mainly in text, images, and audio;
  • web applications ("spaces" and "widgets"), intended for small-scale demos of machine learning applications.
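A brief sketch of programmatic access to the Hub through the companion huggingface_hub Python client; the repository and file names below are illustrative assumptions:

    from huggingface_hub import hf_hub_download

    # Fetch a single file from a model repository on the Hub;
    # the result is the path of the locally cached copy.
    path = hf_hub_download(repo_id="bert-base-uncased", filename="config.json")
    print(path)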

There are numerous pre-trained models that support common tasks in different modalities, such as:

  • Natural Language Processing: text classification, named entity recognition, question answering, language modeling, summarization, translation, multiple choice, and text generation.
  • Computer Vision: image classification, object detection, and segmentation.
  • Audio: automatic speech recognition and audio classification.
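These tasks map onto the same pipeline abstraction across modalities; a sketch, assuming default checkpoints and local input files:

    from transformers import pipeline

    # Each pipeline downloads a default pre-trained model for its task.
    vision = pipeline("image-classification")
    speech = pipeline("automatic-speech-recognition")

    print(vision("cat.jpg"))      # assumes a local image file exists
    print(speech("speech.wav"))   # assumes a local audio file exists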

Other libraries

[Image: Gradio UI example]

In addition to Transformers and the Hugging Face Hub, the Hugging Face ecosystem contains libraries for other tasks, such as dataset processing ("Datasets"), model evaluation ("Evaluate"), and machine learning demos ("Gradio").[19]
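As a sketch of what the Gradio library provides, a minimal demo wrapping a made-up function in a web interface:

    import gradio as gr

    # A trivial function to expose through a web UI.
    def greet(name):
        return "Hello, " + name + "!"

    # Interface wires the function to text input/output widgets;
    # launch() starts a local web server hosting the demo.
    demo = gr.Interface(fn=greet, inputs="text", outputs="text")
    demo.launch()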

See also

References

  1. ^ "Terms of Service – Hugging Face". huggingface.co. Retrieved 2024-05-24.
  2. ^ a b "Hugging Face wants to become your artificial BFF". TechCrunch. 9 March 2017. Archived from the original on 2022-09-25. Retrieved 2023-09-17.
  3. ^ "Hugging Face raises $40 million for its natural language processing library". 11 March 2021. Archived from the original on 28 July 2023. Retrieved 5 August 2022.
  4. ^ "Inside BigScience, the quest to build a powerful open language model". 10 January 2022. Archived from the original on 1 July 2022. Retrieved 5 August 2022.
  5. ^ "BLOOM". bigscience.huggingface.co. Archived from the original on 2022-11-14. Retrieved 2022-08-20.
  6. ^ "Inside a radical new project to democratize AI". MIT Technology Review. Archived from the original on 2022-12-04. Retrieved 2023-08-25.
  7. ^ Nataraj, Poornima (2021-12-23). "Hugging Face Acquires Gradio, A Customizable UI Components Library For Python". Analytics India Magazine. Retrieved 2024-01-26.
  8. ^ Cai, Kenrick. "The $2 Billion Emoji: Hugging Face Wants To Be Launchpad For A Machine Learning Revolution". Forbes. Archived from the original on 2022-11-03. Retrieved 2022-08-20.
  9. ^ "Introducing the Private Hub: A New Way to Build With Machine Learning". huggingface.co. Archived from the original on 2022-11-14. Retrieved 2022-08-20.
  10. ^ Bass, Dina (2023-02-21). "Amazon's Cloud Unit Partners With Startup Hugging Face as AI Deals Heat Up". Bloomberg News. Archived from the original on 2023-05-22. Retrieved 2023-02-22.
  11. ^ Nellis, Stephen (2023-02-21). "Amazon Web Services pairs with Hugging Face to target AI developers". Reuters. Archived from the original on 2023-05-30. Retrieved 2023-02-22.
  12. ^ "AWS and Hugging Face collaborate to make generative AI more accessible and cost efficient | AWS Machine Learning Blog". aws.amazon.com. 2023-02-21. Archived from the original on 2023-08-25. Retrieved 2023-08-25.
  13. ^ Leswing, Kif (2023-08-24). "Google, Amazon, Nvidia and other tech giants invest in AI startup Hugging Face, sending its valuation to $4.5 billion". CNBC. Archived from the original on 2023-08-24. Retrieved 2023-08-24.
  14. ^ "META Collaboration Launches AI Accelerator for European Startups". Yahoo Finance. 2024-06-25. Retrieved 2024-07-11.
  15. ^ "🤗 Transformers". huggingface.co. Archived from the original on 2023-09-27. Retrieved 2022-08-20.
  16. ^ "First release". GitHub. Nov 17, 2018. Archived from the original on 30 April 2023. Retrieved 28 March 2023.
  17. ^ "xenova/transformers.js". GitHub.
  18. ^ "Hugging Face Hub documentation". huggingface.co. Archived from the original on 2023-09-20. Retrieved 2022-08-20.
  19. ^ "Hugging Face - Documentation". huggingface.co. Archived from the original on 2023-09-30. Retrieved 2023-02-18.