llm-export Public
Forked from wangzhaode/llm-export
llm-export can export LLM models to ONNX.
Python Apache License 2.0 Updated Oct 9, 2024
Awesome-LLM-Inference Public
Forked from DefTruth/Awesome-LLM-Inference
📖 A curated list of Awesome LLM Inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
GNU General Public License v3.0 Updated Aug 1, 2024
timeflies Public
Forked from focusunsink/timeflies
Compute the time of Model
Python Updated Jul 10, 2024
iree Public
Forked from iree-org/iree
A retargetable MLIR-based machine learning compiler and runtime toolkit.
C++ Apache License 2.0 Updated Mar 27, 2023
cutlass Public
Forked from NVIDIA/cutlass
CUDA Templates for Linear Algebra Subroutines
C++ Other Updated Feb 28, 2023
xbyak Public
Forked from herumi/xbyak
A JIT assembler for x86 (IA-32)/x64 (AMD64, x86-64) MMX/SSE/SSE2/SSE3/SSSE3/SSE4/FPU/AVX/AVX2/AVX-512, provided as a C++ header
C++ BSD 3-Clause "New" or "Revised" License Updated Sep 8, 2022