Alibaba
- HangZhou
(UTC +08:00) - http://wangfakang.github.io
Stars
Currently, it's another gateway based on Istio/Envoy. (TODO: give it a better description)
MSCCL++: A GPU-driven communication stack for scalable AI applications
oneAPI Collective Communications Library (oneCCL)
DeepLearning Framework Performance Profiling Toolkit
The simplest, fastest repository for training/finetuning medium-sized GPTs.
A library to analyze PyTorch traces.
Alveo Collective Communication Library: MPI-like communication operations for Xilinx Alveo accelerators
Unified Communication X (mailing list - https://elist.ornl.gov/mailman/listinfo/ucx-group)
TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches
Open-source observability for your LLM application, based on OpenTelemetry
Collective communications library with various primitives for multi-machine training.
Awesome LLM Plaza: daily tracking of all sorts of awesome LLM topics, e.g. LLMs for coding, robotics, reasoning, multimodal, etc.
A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
Ongoing research training transformer models at scale
Blink+: Increase GPU group bandwidth by utilizing cross-tenant NVLink.
Tools for monitoring NVIDIA GPUs on Linux
Infrastructure Programmer Development Kit (IPDK) is an open source, vendor agnostic framework of drivers and APIs for infrastructure offload and management that runs on a CPU, IPU, DPU or switch.
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
Grumpy is a Python to Go source code transcompiler and runtime.
Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.