Sparsity-aware deep learning inference runtime for CPUs
An easy-to-use LLM quantization package with user-friendly APIs
Large Language Model Text Generation Inference
OpenAI-style API for open large language models (request sketch below)
Libraries for applying sparsification recipes to neural networks
Neural Network Compression Framework for enhanced OpenVINO inference
Efficient few-shot learning with Sentence Transformers (few-shot sketch below)
A Unified Library for Parameter-Efficient Learning
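
The OpenAI-style API entry above refers to the now-standard `/v1/chat/completions` request format. As a rough illustration, here is a minimal sketch of calling such an endpoint; the base URL, model name, and prompt are placeholders, not values taken from any particular project.

```python
import requests

# Hypothetical local endpoint exposing an OpenAI-compatible chat API.
BASE_URL = "http://localhost:8000/v1"

payload = {
    "model": "my-local-model",  # placeholder model name
    "messages": [
        {"role": "user", "content": "Summarize sparsity-aware inference in one sentence."}
    ],
    "max_tokens": 64,
}

# Standard OpenAI-compatible chat completions call.
resp = requests.post(f"{BASE_URL}/chat/completions", json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```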
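
For the few-shot learning entry, here is a minimal illustrative sketch of the underlying idea using only the sentence-transformers API: embed a handful of labeled examples per class and assign new texts to the nearest class centroid. This is a simplification for illustration, not the library's own training procedure; the model name and example data are placeholders.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence-embedding model

# A handful of labeled examples per class (the "few shots").
few_shot = {
    "positive": ["Great product, works perfectly.", "Absolutely loved it."],
    "negative": ["Broke after one day.", "Terrible customer service."],
}

# Class centroids from the normalized embeddings of the labeled examples.
centroids = {
    label: model.encode(examples, normalize_embeddings=True).mean(axis=0)
    for label, examples in few_shot.items()
}

def classify(text: str) -> str:
    # Assign the text to the class whose centroid has the highest cosine similarity.
    emb = model.encode(text, normalize_embeddings=True)
    return max(centroids, key=lambda label: float(np.dot(emb, centroids[label])))

print(classify("The battery died within a week."))  # expected: "negative"
```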