Module 4


Accelerate AI/ML Training and Inference

Intro to Neuron SDK

Lu Zou
Sr. Partner Solutions Architect, AI/ML
Amazon Web Services

© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Getting started with AWS Trainium and AWS Inferentia

• Launch instances (Trn1, Inf2, Inf1)
• Integrations
• A few lines of code, run
• Monitor, tune, scale

Neuron SDK
AWS Deep Learning AMIs
AWS Deep Learning Containers
AWS Neuron Getting Started: https://bityl.co/IcJO
Neuron SDK Overview (Neuron SDK version 2.20)

AWS Neuron: SDK compiler
THE AWS NEURON COMPILER

• Graph optimizations (hardware agnostic)
• Loop optimizations (layout, tiling, vectorization, pipelining)
• Hardware intrinsic mapping: z = matmul_128x128(x,y)
• Scheduling and allocation (working set minimization, latency hiding)
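As an illustration of how loop tiling and hardware intrinsic mapping fit together, here is a minimal pure-Python sketch. It is not the Neuron compiler's implementation: `matmul_tile` is a hypothetical stand-in for a fixed-size intrinsic such as the slide's `matmul_128x128`, and `tiled_matmul` is an assumed name for the loop nest that would call it once per tile.

```python
TILE = 128  # tile edge, taken from the slide's matmul_128x128 example


def matmul_tile(x_tile, y_tile, acc):
    """Hypothetical stand-in for a fixed-size hardware matmul intrinsic.

    Accumulates x_tile @ y_tile into acc in place.
    """
    for i, x_row in enumerate(x_tile):
        acc_row = acc[i]
        for k, x_ik in enumerate(x_row):
            y_row = y_tile[k]
            for j, y_kj in enumerate(y_row):
                acc_row[j] += x_ik * y_kj


def tiled_matmul(x, y, tile=TILE):
    """Multiply x (m x k) by y (k x n) one tile at a time."""
    m, k, n = len(x), len(y), len(y[0])
    assert all(len(row) == k for row in x), "inner dimensions must match"
    z = [[0] * n for _ in range(m)]
    for i0 in range(0, m, tile):          # rows of the output tile
        for j0 in range(0, n, tile):      # columns of the output tile
            rows = min(tile, m - i0)
            cols = min(tile, n - j0)
            acc = [[0] * cols for _ in range(rows)]
            for k0 in range(0, k, tile):  # reduction dimension, tile by tile
                x_tile = [row[k0:k0 + tile] for row in x[i0:i0 + tile]]
                y_tile = [row[j0:j0 + tile] for row in y[k0:k0 + tile]]
                matmul_tile(x_tile, y_tile, acc)
            # Write the finished output tile back into the full result.
            for di, acc_row in enumerate(acc):
                z[i0 + di][j0:j0 + cols] = acc_row
    return z
```

Each call to `matmul_tile` touches only three tile-sized buffers, which is the kind of bounded working set that the scheduling and allocation stage can then place and overlap.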

Neuron Kernel Interface (NKI) (Neuron SDK version 2.20)
HIGH PERFORMANCE USER-DEFINED KERNELS

• Graph optimizations (hardware agnostic)
• Loop optimizations (layout, tiling, vectorization, pipelining)
• Hardware intrinsic mapping: z = matmul_128x128(x,y)
• Scheduling and allocation (working set minimization, latency hiding)

Monitoring and visualization
Neuron Top
Neuron ls

Monitoring and visualization
Neuron Profiler
$ neuron-profile capture -n file.neff -s profile.ntff
$ neuron-profile view -n file.neff -s profile.ntff
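If the two profiler steps need to be scripted, a small wrapper can assemble them. The sketch below only builds the argument lists using the subcommands and flags shown on the slide; the function name `profile_commands` is an assumption for illustration, and actually running the commands requires an instance with the Neuron tools installed.

```python
import shlex


def profile_commands(neff_path, ntff_path):
    """Return (capture, view) argument lists for neuron-profile.

    Uses only the subcommands and flags shown on the slide:
    -n for the compiled .neff file, -s for the .ntff profile file.
    """
    capture = ["neuron-profile", "capture", "-n", neff_path, "-s", ntff_path]
    view = ["neuron-profile", "view", "-n", neff_path, "-s", ntff_path]
    return capture, view


if __name__ == "__main__":
    for cmd in profile_commands("file.neff", "profile.ntff"):
        # Print the shell-quoted command; on a Neuron instance each list
        # could instead be passed to subprocess.run(cmd, check=True).
        print(shlex.join(cmd))
```

Keeping the commands as lists rather than strings avoids quoting problems when file paths contain spaces.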

Monitor workloads through AWS CloudWatch
NEURON MONITOR CONTAINER: SEAMLESS INTEGRATION WITH CONTAINERIZED NEURON APPLICATIONS

• Easy integration with Prometheus and Grafana, simplifying the monitoring process
• Neuron Monitor is integrated into CloudWatch EKS Container Insights, enabling automatic discovery of critical health metrics from Trainium and Inferentia instances
• Dashboards give visibility across the layers of the cluster, showing Neuron metrics correlated with other parts of the infrastructure, including EFA, CPU, memory, network, filesystem, and Kubernetes itself
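To make the Prometheus integration point concrete, the sketch below parses metrics in the Prometheus text exposition format, which is what a Prometheus server scrapes from an exporter. The metric names in the sample are hypothetical and are not taken from Neuron Monitor's actual output; the parser also assumes samples carry no trailing timestamp.

```python
def parse_prometheus_text(text):
    """Parse simple Prometheus text-format samples into {series: value}.

    Assumes each sample line is '<name>{<labels>} <value>' or
    '<name> <value>' with no trailing timestamp. Comment lines
    (# HELP, # TYPE) and blank lines are skipped.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated field on the line.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Hypothetical metric names, for illustration only.
SAMPLE = """
# HELP example_neuroncore_utilization_ratio Hypothetical utilization gauge.
# TYPE example_neuroncore_utilization_ratio gauge
example_neuroncore_utilization_ratio{neuroncore="0"} 0.75
example_neuroncore_utilization_ratio{neuroncore="1"} 0.5
example_host_memory_used_bytes 1024
"""

if __name__ == "__main__":
    for series, value in parse_prometheus_text(SAMPLE).items():
        print(series, value)
```

In practice Prometheus does this scraping itself and Grafana reads from Prometheus; the parser only shows the shape of the data that moves between them.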

More Ways to Get Started

Amazon SageMaker (managed service)
• Use case: Bring your own models and training/inference script to deploy via managed ML service
• Start with: PyTorch Neuron DLC / TensorFlow Inferentia DLC
• Resources: Learn; Blog, Blog; Hands On; Tutorial; SageMaker examples

Amazon SageMaker (managed service)
• Use case: Models from Hugging Face. Fine-tune and deploy open-source models from Hugging Face directly on SageMaker
• Start with: Hugging Face and SageMaker
• Resources: Example Notebook 1; Example Notebook 2

Amazon SageMaker JumpStart
• Use case: Fine-tune and deploy a selection of fine-tuned models
• Start with: SageMaker
• Resources: Learn; LLAMA-3 Blog; Hands On; LLAMA-2 example Notebook

Self-managed, EC2 Trn1/Inf2
• Use case: Models from Hugging Face. Fine-tune and deploy open-source models from Hugging Face directly on EC2 using the Optimum Neuron library on DLAMI
• Start with: Hugging Face and SageMaker
• Resources: Hands On; Tutorial

Self-managed, EC2 Trn1/Inf2
• Use case: Pre-train, fine-tune, or deploy any supported model on EC2 using DLC, EKS
• Start with: Neuron DLAMI; neuronx-distributed
• Resources: Hands On; Tutorials; Workshop

AWS Neuron Getting Started
https://bityl.co/IcJO

Thank you!

Sample Codes
Training and Inference Performance

Amazon Web Services, AWS, the Powered by AWS logo, and all AWS service names used in this slide deck are trademarks of Amazon.com, Inc. or its affiliates.
