Featured Posts
Bridging the Last Mile: Deploying Hummingbird-XT for Efficient Video Generation on AMD Consumer-Grade Platforms
Learn how to use the Hummingbird-XT and Hummingbird-XTX models to generate videos. Explore the video diffusion model acceleration solution, including the DiT distillation method and a lightweight VAE model.
Scaling AI Inference Performance with vLLM on AMD Instinct MI355X GPUs
Explore how the MI355X performs against the B200 in vLLM benchmarks across DeepSeek-R1, GPT-OSS-120B, Qwen3-235B, and Llama-3.3-70B.
AMD Enterprise AI Suite: Open Infrastructure for Production AI
Explore an open, GPU-optimized platform to build, deploy, and scale enterprise AI workloads on AMD Instinct with production-ready performance.
ROCm 7.9 Technology Preview: ROCm Core SDK and TheRock Build System
Get an introduction to the ROCm Core SDK and learn how to install and build ROCm components easily using TheRock.
Applying Compute Partitioning for Workloads on MI300X GPUs
Learn how to boost MI300X performance using GPU compute partitioning for parallel workloads like GROMACS and REINVENT.
Reimagining GPU Allocation in Kubernetes: Introducing the AMD GPU DRA Driver
Explore how the AMD GPU DRA Driver brings declarative, attribute-aware GPU scheduling to Kubernetes, and learn how to request and manage GPUs natively.
Installing AMD HIP-Enabled GROMACS on HPC Systems: A LUMI Supercomputer Case Study
Learn how to build and install AMD HIP-enabled GROMACS on HPC systems, using the LUMI supercomputer as a case study.
Athena-PRM: Enhancing Multimodal Reasoning with Data-Efficient Process Reward Models
Learn how to use a data-efficient Process Reward Model to enhance the reasoning ability of large language and multimodal models.
Accelerating llama.cpp on AMD Instinct MI300X
Learn more about the performance of llama.cpp on AMD Instinct platforms.
Democratizing AI Compute with AMD Using SkyPilot
Learn how SkyPilot integrates with the AMD open AI stack to enable seamless multi-cloud deployment and simplify NVIDIA-to-AMD GPU migration.
Continuing the Momentum: Refining ROCm For The Next Wave Of AI and HPC
ROCm 7.1 builds on 7.0’s AI and HPC advances with faster performance, stronger reliability, and streamlined tools for developers and system builders.
ROCm 7.0: An AI-Ready Powerhouse for Performance, Efficiency, and Productivity
Discover how ROCm 7.0 integrates AI across every layer, combining hardware enablement, frameworks, model support, and a suite of optimized tools.
Accelerating IBM Granite 4.0 with FP8 using AMD Quark on MI300/MI355 GPUs
Learn how AMD Instinct MI355 Series GPUs deliver competitive Granite 4.0 inference with faster TTFT, lower latency, and strong throughput.
Using Gradient Boosting Libraries on MI300X for Financial Risk Prediction
This blog shows how to run LightGBM and ThunderGBM GPU-accelerated training on AMD Instinct MI300X GPUs with ROCm for finance-focused workloads.
High-Resolution Weather Forecasting with StormCast on AMD Instinct GPU Accelerators
A showcase of how to run high-resolution weather prediction models such as StormCast on AMD Instinct hardware.
Breaking the Accuracy-Speed Barrier: How MXFP4/6 Quantization Revolutionizes Image and Video Generation
Explore how MXFP4/6, supported by AMD Instinct™ MI350 series GPUs, achieves BF16-comparable image and video generation quality.
Introducing the AMD Network Operator v1.0.0: Simplifying High-Performance Networking for AMD Platforms
Introducing the AMD Network Operator, which automates high-performance AI NIC networking in Kubernetes for AI and HPC workloads.
Accelerating Multimodal Inference in vLLM: The One-Line Optimization for Large Multimodal Models
Learn how to optimize multimodal model inference with batch-level data parallelism for vision encoders in vLLM, achieving up to 45% throughput gains on AMD MI300X.
GEAK HIP: Expanding GEAK for HIP Code Optimization
Explore the GEAK framework's AI-driven HIP code optimization for improved performance on AMD GPUs, including speedup examples and benefits for AI workloads.
Getting Started with AMD AI Workbench: Deploying and Managing AI Workloads
Learn how to deploy and manage AI workloads with AMD AI Workbench, a low-code interface that lets developers manage AI inference deployments.
Stay informed
- Subscribe to our RSS feed (requires an RSS reader, available as a browser plugin)
- Sign up for the ROCm newsletter
- View our blog statistics
- View the ROCm Developer Hub
- Report an issue or request a feature
- We are eager to learn from our community! If you would like to contribute to the ROCm Blogs, please submit your technical blog for review on our GitHub. To get started, see our GitHub user guide.