The Best Open-Source AI Models - All Your Free-To-Use Options Explained - ZDNET

The document discusses the evolution and significance of open-source generative AI models, highlighting their advantages such as customization and transparency, while also noting the limitations of proprietary models in certain industries. It introduces the Open Source AI Definition (OSAID) and evaluates various AI models based on their compliance with OSAID standards, categorizing them into compliant, potentially compliant, and non-compliant groups. Additionally, it outlines the requirements for running these models and provides a comprehensive overview of leading open-source AI models across different types, including language, image, vision, and audio models.


zdnet.com

The best open-source AI models: All your free-to-use options explained
Written by Jason Perlow, Senior Contributing Writer Nov. 6, 2024 at 3:00 a.m. PT


Generative AI (Gen AI) has advanced significantly since its public launch
two years ago. The technology has led to transformative applications
that can create text, images, and other media with impressive accuracy
and creativity.


Open-source generative models are valuable for developers,
researchers, and organizations wanting to leverage cutting-edge AI
technology without incurring high licensing fees or restrictive commercial
policies. Let's find out more.

Open-source vs. proprietary models

Open-source AI models offer several advantages, including
customization, transparency, and community-driven innovation. These
models allow users to tailor them to specific needs and benefit from
ongoing enhancements. Additionally, they typically come with licenses
that permit both commercial and non-commercial use, which enhances
their accessibility and adaptability across various applications.


However, open-source solutions are not always the best choice. In
industries that demand strict regulatory compliance, data privacy, and
specialized support, proprietary models often perform better. They
provide stronger legal frameworks, dedicated customer support, and
optimizations tailored to industry requirements. Closed-source solutions
may also excel in highly specialized tasks, thanks to exclusive features
designed for high performance and reliability.

When organizations require real-time updates, advanced security, or
specialized functionalities, proprietary models can offer a more robust
and secure solution, effectively balancing openness with the rigorous
demands for quality and accountability.

The Open Source AI Definition

The Open Source Initiative (OSI) recently introduced the Open Source
AI Definition (OSAID) to clarify what qualifies as genuinely open-source
AI. To meet OSAID standards, a model must be fully transparent in its
design and training data, enabling users to recreate, adapt, and use it
freely.


However, some popular models, including Meta's LLaMA and Stability
AI's Stable Diffusion, have licensing restrictions or lack transparency
around training data, preventing full compliance with OSAID.

As part of the OSAID validation process, OSI assessed the following:

• Compliant models: Pythia (Eleuther AI), OLMo (AI2), Amber and
CrystalCoder (LLM360), and T5 (Google).

• Potentially compliant models: Bloom (BigScience), Starcoder2
(BigCode), and Falcon (TII) could meet OSAID standards with minor
adjustments to licensing terms or transparency.

• Non-compliant models: LLaMA (Meta), Grok (X/Twitter), Phi
(Microsoft), and Mixtral (Mistral) lack the necessary transparency or
impose restrictive licensing terms.

The OSAID has sparked notable dissent among prominent open-source
community members. Because it diverges from the traditional
open-source definition used for software, its relevance and impact on
open-source generative AI models have stirred intense debate across
community forums, including the Open Source Definition's bulletin
boards (an alternative organization to the OSI), developer mailing lists,
and public platforms like LinkedIn.

LLaMA and other non-compliant architectures

The Meta LLaMA architecture exemplifies noncompliance with OSAID
due to its restrictive research-only license and lack of full transparency
about training data, limiting commercial use and reproducibility. Derived
models, like Mistral's Mixtral and the Vicuna Team's MiniGPT-4, inherit
these restrictions, propagating LLaMA's noncompliance across
additional projects.


Beyond LLaMA-based models, other widely used architectures face
similar issues. For example, Stable Diffusion by Stability AI employs the
Creative ML OpenRAIL-M license, which includes ethical restrictions
that deviate from OSAID's requirements for unrestricted use. Similarly,
Grok by xAI combines proprietary elements with usage limitations,
challenging its alignment with open-source ideals.

These examples underscore the difficulty of meeting OSAID's
standards, as many AI developers balance open access with
commercial and ethical considerations.

Implications for organizations: OSAID compliance vs. non-compliance

Choosing OSAID-compliant models gives organizations transparency,
legal security, and full customizability, all essential for responsible
and flexible AI use. These compliant models adhere to ethical practices
and benefit from strong community support, promoting collaborative
development.

In contrast, non-compliant models may limit adaptability and rely more
heavily on proprietary resources. For organizations that prioritize
flexibility and alignment with open-source values, OSAID-compliant
models are advantageous. However, non-compliant models can still be
valuable when proprietary features are required.

Understanding licensing in open-source AI models

Open-source AI models are released under licenses that define usage,
modification, and sharing conditions. While some licenses align with
traditional open-source standards, others incorporate restrictions or
ethical guidelines that prevent full OSAID compliance. Key licenses
include:

• Apache 2.0: A permissive license that allows free use, modification, and
distribution, along with a patent grant. Apache 2.0 is OSI-approved and
popular for open-source projects, providing flexibility and legal
protection.

• MIT: Another permissive license that only requires attribution for reuse.
Like Apache 2.0, MIT is OSI-approved, widely adopted, and offers
simplicity and minimal restrictions.

• Creative ML OpenRAIL-M: A license designed for AI applications,
allowing broad use but imposing ethical guidelines to prevent harmful
use. OpenRAIL-M is not OSI-approved because it includes usage
restrictions that conflict with the OSI's principles of unrestricted freedom.
However, it is valued by developers aiming to prioritize ethical use in AI.

• CC BY-SA: The Creative Commons Share-Alike license permits free
use and requires derivative works to remain open source. While it
encourages open collaboration, it's not OSI-approved and is more
commonly used for content rather than code, as it lacks some flexibility
for software applications.

• CC BY-NC 4.0: A Creative Commons license that permits free use with
attribution but restricts commercial applications. This license, used for
certain model weights (like Meta's MusicGen and AudioGen), limits the
models' usability in commercial environments and does not align with
OSI's open-source standards.

• Custom licenses: Many models on our list, such as IBM's Granite and
Nvidia's NeMo, operate under proprietary or custom licenses. These
models often impose specific conditions for use or modify traditional
open-source terms to align with commercial goals, making them non-
compliant with open-source principles.

• Research-only licenses: Certain models, such as Meta's LLaMA and
Codellama series, are available only under research-use terms. These
licenses restrict use to academic or non-commercial purposes and
prevent broad community-driven projects, as they do not meet OSI's
open-source criteria.
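
The license properties described in the list above can be captured in a small lookup table. The following is a minimal sketch, not a legal reference: the `LicenseInfo` fields and the `allows_commercial_deployment` helper are hypothetical names introduced for illustration, and the values only restate what the list says. Custom licenses are deliberately absent, so they fall through to "needs review."

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LicenseInfo:
    osi_approved: bool      # OSI approval status as stated in the list above
    commercial_use: bool    # whether commercial use is permitted
    use_restrictions: bool  # ethical, non-commercial, or research-only limits


# Values restate the article's descriptions; they are not legal advice.
LICENSES = {
    "Apache 2.0": LicenseInfo(True, True, False),
    "MIT": LicenseInfo(True, True, False),
    "Creative ML OpenRAIL-M": LicenseInfo(False, True, True),
    "CC BY-SA": LicenseInfo(False, True, False),
    "CC BY-NC 4.0": LicenseInfo(False, False, True),
    "Research-only": LicenseInfo(False, False, True),
}


def allows_commercial_deployment(license_name: str) -> bool:
    """Return True only for known licenses that permit commercial use.

    Unknown or custom licenses return False, meaning "review manually".
    """
    info = LICENSES.get(license_name)
    return bool(info and info.commercial_use)


def osi_approved_licenses() -> list[str]:
    """List the licenses the article describes as OSI-approved."""
    return sorted(name for name, info in LICENSES.items() if info.osi_approved)
```

For example, `allows_commercial_deployment("CC BY-NC 4.0")` is false, while a custom license name that is not in the table is also reported as false so that it gets a manual review rather than a silent pass.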

Requirements for running open-source AI models

Running open-source Gen AI models requires specific hardware,
software environments, and toolsets for model training, fine-tuning, and
deployment tasks. High-performance models with billions of parameters
benefit from powerful GPU setups like Nvidia's A100 or H100.
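
To see why parameter count drives hardware choice, a rough back-of-the-envelope estimate helps. The sketch below multiplies parameter count by bytes per parameter to approximate the memory needed for the weights alone. It is an assumption-laden estimate: it ignores activations, KV cache, and optimizer state, and the 80 GB per-GPU figure is an assumed capacity (both the A100 and H100 are available in 80 GB configurations). The function names are hypothetical.

```python
import math

# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str = "fp16") -> float:
    """Approximate memory for model weights only, in gigabytes (1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


def min_gpus_for_weights(
    num_params: float, precision: str = "fp16", gpu_memory_gb: float = 80.0
) -> int:
    """Lower bound on GPU count needed just to hold the weights.

    gpu_memory_gb defaults to an assumed 80 GB card; real deployments
    need extra headroom for activations and caches.
    """
    return math.ceil(weight_memory_gb(num_params, precision) / gpu_memory_gb)


print(weight_memory_gb(7e9))        # 7B model at fp16
print(weight_memory_gb(176e9))      # 176B model at fp16
print(min_gpus_for_weights(176e9))  # lower bound on 80 GB GPUs
```

Under these assumptions a 7B model at fp16 needs about 14 GB for weights and fits on a single card, while a 176B model needs about 352 GB and has to be split across at least five 80 GB GPUs.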


Essential environments typically include Python and machine learning
libraries like PyTorch or TensorFlow. Specialized toolsets, including
Hugging Face's Transformers library and Nvidia's NeMo, simplify the
processes of fine-tuning and deployment. Docker helps maintain
consistent environments across different systems, while Ollama allows
for the local execution of large language models on compatible systems.
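
As an illustration of local execution, the sketch below sends a prompt to a locally running Ollama server over its HTTP API using only the Python standard library. It assumes Ollama is installed and listening on its default address (`http://localhost:11434`), and that a model has already been pulled; the model tag `mistral` in the usage note is only an example. The helper names are hypothetical.

```python
import json
import urllib.request

# Default address of a local Ollama server (assumed; change if configured otherwise).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generation request without sending it."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the request and return the generated text.

    Requires a running Ollama server with the named model available.
    """
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["response"]
```

With the server running and a model pulled (for instance via `ollama pull mistral`), calling `generate("mistral", "Summarize the Apache 2.0 license in one sentence.")` should return the model's reply as a string. Building the request separately from sending it keeps the payload easy to inspect and test offline.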

The following chart highlights essential toolsets, recommended
hardware, and their specific functions for managing open-source AI
models:

| Toolset | Purpose | Requirements | Use |
|---|---|---|---|
| Python | Primary programming environment | N/A | Essential for scripting and configuring models |
| PyTorch | Model training and inference | GPU (e.g., Nvidia A100, H100) | Widely used library for deep learning models |
| TensorFlow | Model training and inference | GPU (e.g., Nvidia A100, H100) | Alternative deep learning library |
| Hugging Face Transformers | Model deployment and fine-tuning | GPU (preferred) | Library for accessing, fine-tuning, and deploying models |
| Nvidia NeMo | Multimodal model support and deployment | Nvidia GPUs | Optimized for Nvidia hardware and multimodal tasks |
| Docker | Environment consistency and deployment | Supports GPUs | Containerizes models for easy deployment |
| Ollama | Running large language models locally | macOS, Linux, Windows, supports GPUs | Platform to run LLMs locally on compatible systems |
| LangChain | Building applications with LLMs | Python 3.7+ | Framework for composing and deploying LLM-powered applications |
| LlamaIndex | Connecting LLMs with external data sources | Python 3.7+ | Framework for integrating LLMs with data sources |

This setup establishes a robust framework for efficiently managing Gen
AI models, from experimentation to production-ready deployment. Each
toolset possesses unique strengths, enabling developers to tailor their
environments for specific project needs.

Choosing the right model

Selecting the right Gen AI model depends on several factors, including
licensing requirements, desired performance, and specific functionality.
While larger models tend to deliver higher accuracy and flexibility, they
require substantial computational resources. Smaller models, on the
other hand, are more suitable for resource-constrained applications and
devices.
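
The trade-off between size and license can be expressed as a simple filter. The sketch below is illustrative only: the `ModelEntry` type and `shortlist` function are hypothetical, and the catalog holds a handful of rows copied from the language models table later in this article rather than a complete or authoritative list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    name: str
    sizes_b: tuple   # listed parameter sizes, in billions
    license: str


# A few rows from the article's language models table.
CATALOG = [
    ModelEntry("EleutherAI GPT-J", (6,), "Apache 2.0"),
    ModelEntry("Mistral AI Mistral 7B", (7,), "Apache 2.0"),
    ModelEntry("TII Falcon", (7, 40), "Apache 2.0"),
    ModelEntry("BigScience BLOOM", (176,), "OpenRAIL-M"),
    ModelEntry("EleutherAI GPT-NeoX", (20,), "MIT"),
    ModelEntry("xAI Grok-1", (314,), "Apache 2.0"),
]


def shortlist(catalog, max_params_b, allowed_licenses):
    """Return (name, largest fitting size) pairs, biggest models first.

    A model qualifies if its license is allowed and at least one of its
    listed sizes is within the parameter budget.
    """
    picks = []
    for model in catalog:
        fitting = [size for size in model.sizes_b if size <= max_params_b]
        if fitting and model.license in allowed_licenses:
            picks.append((model.name, max(fitting)))
    return sorted(picks, key=lambda pick: pick[1], reverse=True)


for name, size in shortlist(CATALOG, 20, {"Apache 2.0", "MIT"}):
    print(f"{name}: {size}B")
```

With a 20B budget and only Apache 2.0 or MIT allowed, BLOOM is excluded by license and Grok-1 by size, while Falcon qualifies through its 7B variant. Sorting by size reflects the article's point that larger models tend to be more capable when the hardware allows.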


It's important to note that most models listed here, even those with
traditionally open-source licenses like Apache 2.0 or MIT, do not meet
the Open Source AI Definition (OSAID). This gap is primarily due to
restrictions around training data transparency and usage limitations,
which OSAID emphasizes as essential for true open-source AI.
However, certain models, such as Bloom and Falcon, show potential for
compliance with minor adjustments to their licenses or transparency
protocols and may achieve full compliance over time.

The tables below provide an organized overview of the leading
open-source generative AI models, categorized by type, issuer, and
functionality, to help you choose the best option for your needs, whether
a fully transparent, community-driven model or a high-performance tool
with specific features and licensing requirements.

Language models

Language models are crucial in text-based applications such as
chatbots, content creation, translation, and summarization. They are
fundamental to natural language processing (NLP) and continually
improve their understanding of language structure and context.

Notable models include Meta's LLaMA, EleutherAI's GPT-NeoX, and
Nvidia's NVLM 1.0 family, each known for its unique strengths in
multilingual, large-scale, and multimodal tasks.

| Issuer & Model | Parameter Sizes | License | Highlights |
|---|---|---|---|
| Google T5 | Small to XXL | Apache 2.0 | High-performance language model, OSAID Compliant |
| EleutherAI Pythia | Various | Apache 2.0 | Interpretability-focused, OSAID Compliant |
| Allen Institute for AI (AI2) OLMo | Various | Apache 2.0 | Open language research model, OSAID Compliant |
| BigScience BLOOM | 176B | OpenRAIL-M | Multilingual, responsible AI, OSAID Potential |
| BigCode Starcoder2 | Various | Apache 2.0 | Code generation, OSAID Potential |
| TII Falcon | 7B, 40B | Apache 2.0 | Efficient and high-performance, OSAID Potential |
| AI21 Labs Jamba Series | Mini to Large | Custom | Language and chat generation |
| AI Singapore Sea-Lion | 7B | Custom | Language and cultural representation |
| Alibaba Qwen Series | 7B | Custom | Bilingual model (Chinese, English) |
| Databricks Dolly 2.0 | 12B | CC BY-SA 3.0 | Open dataset, commercial use |
| EleutherAI GPT-J | 6B | Apache 2.0 | General-purpose language model |
| EleutherAI GPT-NeoX | 20B | MIT | Large-scale text generation |
| Google Gemma 2 | 2B, 9B, 27B | Apache 2.0 | Language and code generation |
| IBM Granite Series | 3B, 8B | Apache 2.0 | Summarization, classification, RAG |
| Meta LLaMA 3.2 | 1B to 405B | Research-only | Advanced NLP, multilingual |
| Microsoft Phi-3 Series | Mini to Medium | MIT | Reasoning, cost-effective |
| Mistral AI Mixtral 8x22B | 8x22B | Apache 2.0 | Sparse model, efficient reasoning |
| Mistral AI Mistral 7B | 7B | Apache 2.0 | Dense, multilingual text generation |
| Nvidia NVLM 1.0 Family | 72B | CC BY-SA 3.0 | High-performance multimodal LLM |
| Rakuten RakutenAI Series | 7B | Custom | Multilingual chat, NLP |
| xAI Grok-1 | 314B | Apache 2.0 | Large-scale language model |

Image generation models

Image generation models create high-quality visuals or artwork from text
prompts, which makes them invaluable for content creators, designers,
and marketers.

Stability AI's Stable Diffusion is widely adopted due to its flexibility and
output quality, while DeepFloyd's IF emphasizes generating realistic
visuals with an understanding of language.

| Issuer & Model | Parameter Sizes | License | Highlights |
|---|---|---|---|
| Stability AI Stable Diffusion 3.5 | 2.5B to 8B | OpenRAIL-M | High-quality image synthesis |
| DeepFloyd IF | 400M to 4.3B | Custom | Realistic visuals with language comprehension |
| OpenAI DALL-E 3 | Not disclosed | Custom | State-of-the-art text-to-image synthesis |
| Google Imagen | Not disclosed | Custom | High-fidelity image generation from text |
| Midjourney | Not disclosed | Custom | Artistic and stylized image generation |
| Adobe Firefly | Not disclosed | Custom | Integrated AI image generation within Adobe products |

Vision models

Vision models analyze images and videos, supporting object detection,
segmentation, and visual generation from text prompts.


These technologies benefit several industries, including healthcare,
autonomous vehicles, and media.

| Issuer & Model | Parameter Sizes | License | Highlights |
|---|---|---|---|
| Meta SAM 2.1 | 38.9M to 224.4M | Apache 2.0 | Video editing, segmentation |
| NVIDIA Consistency | Not disclosed | Custom | Character consistency across video frames |
| NVIDIA VISTA-3D | Not disclosed | Custom | Medical imaging, anatomical segmentation |
| NVIDIA NV-DINOv2 | Not disclosed | Non-commercial | Image embedding generation |
| Google DeepLab | Not disclosed | Apache 2.0 | High-quality semantic image segmentation |
| Microsoft Florence | 0.23B, 0.77B | MIT | General-purpose visual model for computer vision |
| OpenAI CLIP | 400M | MIT | Text and image comprehension |

Audio models

Audio models process and generate audio data, enabling speech
recognition, text-to-speech synthesis, music composition, and audio
enhancement.

| Issuer & Model | Sizes | License | Highlights |
|---|---|---|---|
| Coqui.ai TTS | N/A | MPL 2.0 | Text-to-speech synthesis, multi-language support |
| ESPnet ESPnet | N/A | Apache 2.0 | End-to-end speech processing toolkit |
| Facebook AI wav2vec 2.0 | Base (95M), Large (317M) | Apache 2.0 | Self-supervised speech recognition |
| Hugging Face Transformers (Speech Models) | Various | Apache 2.0 | Collection of ASR and TTS models |
| Magenta MusicVAE | N/A | Apache 2.0 | Music generation and interpolation |
| Meta MusicGen | N/A | MIT / CC BY-NC 4.0 | Music generation from text prompts |
| Meta AudioGen | N/A | MIT / CC BY-NC 4.0 | Sound effect generation from text prompts |
| Meta EnCodec | N/A | MIT / CC BY-NC 4.0 | High-quality audio compression |
| Mozilla DeepSpeech | N/A | MPL 2.0 | End-to-end speech-to-text engine |
| NVIDIA NeMo (Speech Models) | Various | Apache 2.0 | ASR and TTS models optimized for Nvidia GPUs |
| OpenAI Jukebox | N/A | MIT | Neural music generation with genre/artist conditioning |
| OpenAI Whisper | 39M to 1.6B | MIT | Multilingual speech recognition and transcription |
| TensorFlow TFLite Speech Models | N/A | Apache 2.0 | Speech recognition models optimized for mobile devices |

Multimodal models

Multimodal models combine text, images, audio, and other data types to
create content from various inputs.


These models are effective in applications requiring language, visual,
and sensory understanding.

| Model Name | Parameter Sizes | License | Highlights |
|---|---|---|---|
| Allen Institute for AI (AI2) Molmo | 1B, 70B | Apache 2.0 | A multimodal AI model that processes text and visual inputs, OSAID-compliant |
| Meta ImageBind | N/A | Custom | Integrates six data types: text, images, audio, depth, thermal, and IMU. |
| Meta SeamlessM4T | N/A | Custom | Provides multilingual translation and transcription services. |
| Meta Spirit LM | N/A | Custom | Combines text and speech to produce natural-sounding outputs. |
| Microsoft Florence-2 | 0.23B, 0.77B | MIT | Handles computer vision and language tasks proficiently. |
| NVIDIA VILA | N/A | Custom | Processes vision-language tasks effectively. |
| OpenAI CLIP | 400M | MIT | Excels in text and image comprehension. |
| Vicuna Team MiniGPT-4 | 13B | Apache 2.0 | Capable of understanding both text and images. |

Retrieval-augmented generation (RAG)

RAG models merge generative AI with information retrieval, allowing
them to incorporate relevant data from extensive datasets into their
responses.
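
The retrieve-then-generate flow can be sketched in a few lines. The example below is a toy: it swaps in bag-of-words cosine similarity for the learned embeddings that production retrievers use, and it stops at building the prompt rather than calling a model. All function names and the sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    query_vec = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vec, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, documents: list[str], k: int = 2) -> str:
    """Place the retrieved passages ahead of the question for the generator."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


DOCS = [
    "Apache 2.0 is a permissive license that includes a patent grant.",
    "CC BY-NC 4.0 restricts commercial applications.",
    "Ollama runs large language models locally.",
]

print(build_prompt("Which license restricts commercial applications?", DOCS, k=1))
```

The resulting prompt would then be passed to any language model. In a real system, the similarity step is where a dedicated retrieval or embedding model from the table below would be used in place of word counts.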

| Issuer & Model | Parameter Sizes | License | Highlights |
|---|---|---|---|
| BAAI BGE-M3 | N/A | Custom | Dense and sparse retrieval optimization |
| IBM Granite 3.0 Series | 3B, 8B | Apache 2.0 | Advanced retrieval, summarization, RAG |
| Nvidia EmbedQA & ReRankQA | 1B | Custom | Multilingual QA, GPU-accelerated retrieval |

Specialized models

Specialized models are optimized for specific fields, such as
programming, scientific research, and healthcare, offering enhanced
functionality tailored to their domains.

| Issuer & Model | Parameter Sizes | License | Highlights |
|---|---|---|---|
| Meta Codellama Series | 7B, 13B, 34B | Custom | Code generation, multilingual programming |
| IBM Granite (Specialized Models) | 3B, 8B, 20B, 34B | Apache 2.0 | Code generation, time series, geospatial |
| Mistral AI Mamba-Codestral | 7B | Apache 2.0 | Focused on coding and multilingual capabilities |
| Mistral AI Mathstral | 7B | Apache 2.0 | Specialized in mathematical reasoning |

Guardrail models

Guardrail models ensure safe and responsible outputs by detecting and
mitigating biases, inappropriate content, and harmful responses.

| Issuer & Model | Parameter Sizes | License | Highlights |
|---|---|---|---|
| NVIDIA NeMo Guardrails | N/A | Apache 2.0 | Open-source toolkit for adding programmable guardrails |
| Google ShieldGemma | 2B, 9B, 27B | Custom | Safety classifier models built on Gemma 2 |
| IBM Granite-Guardian | 8B | Apache 2.0 | Detects unethical or harmful content |

Choose open-source models

The landscape of generative AI is evolving rapidly, with open-source
models crucial for making advanced technology accessible to all. These
models allow for customization and collaboration, breaking down
barriers that have limited AI development to large corporations.


Developers can tailor solutions to their needs by choosing open-source
Gen AI, contributing to a global community, and accelerating
technological progress. The variety of available models -- from language
and vision to safety-focused designs -- ensures options for almost any
application.

Supporting open-source AI communities will be essential for promoting
ethical and innovative AI developments, benefiting individual projects,
and advancing technology responsibly.
