wangfakang

Organizations

@envoyproxy


Currently, it's another gateway based on Istio/Envoy. (TODO: give it a better description)

Go 65 17 Updated Sep 17, 2024

MSCCL++: A GPU-driven communication stack for scalable AI applications

C++ 233 30 Updated Sep 18, 2024

oneAPI Collective Communications Library (oneCCL)

C++ 189 67 Updated Aug 22, 2024

DeepLearning Framework Performance Profiling Toolkit

Python 276 27 Updated Mar 28, 2022

mperf is an operator performance tuning toolbox for mobile and embedded platforms

C++ 169 26 Updated Aug 17, 2023

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 36,238 5,664 Updated Aug 19, 2024

A library to analyze PyTorch traces.

Python 271 37 Updated Sep 7, 2024

ROCm Communication Collectives Library (RCCL)

C++ 251 113 Updated Sep 17, 2024

Alveo Collective Communication Library: MPI-like communication operations for Xilinx Alveo accelerators

C++ 81 26 Updated Aug 16, 2024
Python 76 35 Updated Dec 11, 2019

NCCL Profiling Kit

Python 104 11 Updated Jul 1, 2024

Unified Communication X (mailing list - https://elist.ornl.gov/mailman/listinfo/ucx-group)

C 1,116 418 Updated Sep 13, 2024

TACCL: Guiding Collective Algorithm Synthesis using Communication Sketches

Python 54 7 Updated Jul 25, 2023

Open-source observability for your LLM application, based on OpenTelemetry

Python 1,838 151 Updated Sep 16, 2024

Collective communications library with various primitives for multi-machine training.

C++ 1,192 299 Updated Jun 26, 2024

awesome llm plaza: daily tracking of all sorts of awesome LLM topics, e.g. LLMs for coding, robotics, reasoning, multimodal, etc.

125 9 Updated Sep 16, 2024

Unified Collective Communication Library

C 190 93 Updated Sep 17, 2024

A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology

C++ 854 143 Updated Jul 8, 2024

Ongoing research training transformer models at scale

Python 9,967 2,252 Updated Sep 18, 2024

A PyTorch Native LLM Training Framework

Python 580 28 Updated Aug 25, 2024

Blink+: Increase GPU group bandwidth by utilizing across tenant NVLink.

Jupyter Notebook 5 2 Updated Jun 22, 2022

Cracks character-selection CAPTCHAs; tested against NetEase and Geetest, with a success rate of 99%

C 649 238 Updated Dec 17, 2020

Tools for monitoring NVIDIA GPUs on Linux

C 1,014 301 Updated Nov 2, 2021

Infrastructure Programmer Development Kit (IPDK) is an open source, vendor agnostic framework of drivers and APIs for infrastructure offload and management that runs on a CPU, IPU, DPU or switch.

Shell 183 68 Updated Feb 2, 2024

Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

Python 14,161 2,224 Updated Aug 31, 2024

Grumpy is a Python to Go source code transcompiler and runtime.

Go 10,542 649 Updated Jan 18, 2022

Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.

Python 27,978 3,352 Updated Sep 16, 2024

GPUDirect Async support for IB Verbs

C++ 88 13 Updated Nov 10, 2022

High-performance, GPU-aware communication library

C++ 85 21 Updated Jul 22, 2024