Accurately Automating Dataset Labeling Using Amazon SageMaker Ground Truth - Will Badr

The document discusses the importance of data labeling for machine learning and introduces Amazon SageMaker Ground Truth as a solution for automating this process. It highlights the challenges of manual labeling and presents features such as automatic labeling, customizable workflows, and human workforce management. The document also outlines the benefits of using Ground Truth, including reduced costs and increased accuracy in labeled datasets.


Accurately automating dataset labeling using Amazon SageMaker Ground Truth

Will Badr
Senior Technical Account Manager, AWS

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Agenda
• Motivation for data labeling

• Introduction to Amazon SageMaker Ground Truth

• End-to-end walkthrough of Ground Truth labeling jobs

• Demo

Why is data labeling necessary?

[Figure: Input data → Neural network model & algorithm → Prediction (“Five”, “Two”, “Three”, “Seven”)]

Why is data labeling challenging?

• ML models need large labeled datasets.
• A large number of humans are needed to label the data.
• It is difficult to achieve high accuracy for labels.
• Labeling consumes up to 80% of the time needed to deploy ML.

Source: https://towardsdatascience.com/semantic-segmentation-with-deep-learning-a-guide-and-code-e52fc8958823

Amazon SageMaker: Build, train, and deploy ML

Amazon SageMaker Ground Truth
Label machine learning training data easily and accurately

KEY FEATURES
• Automatic labeling via machine learning
• Ready-made and custom workflows
• Private and public human workforce
• Label management

For what purposes is data labeling used?

• Text analysis
• Precision agriculture
• Manufacturing efficiency
• Food safety
• Self-driving cars
• Inventory cataloging
… and many more use cases

Use pre-built labeling workflows or set up your own

• Text classification
• Semantic segmentation
• Object detection
• Image classification
• Custom template
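
Whichever workflow is used, a labeling job reads its input from a manifest stored in S3. As a rough sketch, assuming the JSON Lines layout in which each line holds one object with a `source-ref` key pointing at an S3 object, a manifest for an image workflow could be generated as follows. The bucket name, the object keys, and the `build_manifest` helper are hypothetical and do not come from the deck.

```python
import json


def build_manifest(s3_uris):
    """Return manifest text: one JSON object per line, each with a "source-ref" key."""
    lines = []
    for uri in s3_uris:
        if not uri.startswith("s3://"):
            raise ValueError(f"not an S3 URI: {uri}")
        lines.append(json.dumps({"source-ref": uri}))
    # JSON Lines files conventionally end with a trailing newline.
    return "\n".join(lines) + "\n"


# Hypothetical bucket and keys, for illustration only.
example_uris = [
    "s3://example-bucket/images/0001.jpg",
    "s3://example-bucket/images/0002.jpg",
]
manifest_text = build_manifest(example_uris)
print(manifest_text, end="")
```

The resulting text would then be uploaded to S3 and referenced when the labeling job is created; the upload step is left out here to keep the sketch free of external dependencies.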


Human labeling

• Public crowd through Mechanical Turk
• Internal workforce
• Third-party vendors
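
When several workers label the same item, their answers have to be merged into a single label. The sketch below uses a plain majority vote as a stand-in. This is an assumption made for illustration and not necessarily the consolidation method Ground Truth applies; the `consolidate` helper and the sample votes are hypothetical.

```python
from collections import Counter


def consolidate(votes):
    """Majority vote over worker labels.

    Returns (winning_label, agreement), where agreement is the fraction
    of workers who chose the winning label. On a tie, the label that
    appeared first in `votes` wins.
    """
    if not votes:
        raise ValueError("need at least one vote")
    counts = Counter(votes)
    label, n = counts.most_common(1)[0]
    return label, n / len(votes)


# Hypothetical answers from three workers for one image.
label, agreement = consolidate(["cat", "cat", "dog"])
print(label, round(agreement, 2))
```

The agreement fraction gives a crude signal of how contested an item was, which could be used to send low-agreement items to additional workers.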

Automated data labeling and active learning

How does it work?

[Diagram, built up over five slides: the Ground Truth labeling loop]
• Input dataset from S3 → Human labeling
• Human labeling → Label consolidation algorithm → Labeled dataset
• Labeled dataset → Active learning: train and deploy DL models
• Trained model → Auto labeling of the input dataset
• High-confidence predictions (> 95%)? Yes → Labeled dataset. No → Human labeling.

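
The confidence gate in this loop can be sketched in a few lines. The example assumes the model returns a confidence score between 0 and 1 for each prediction and reuses the "> 95%" threshold from the diagram. The `route_predictions` helper, the item IDs, and the scores are hypothetical, and the real service may apply the threshold differently.

```python
CONFIDENCE_THRESHOLD = 0.95  # the "> 95%" gate shown in the diagram


def route_predictions(predictions, threshold=CONFIDENCE_THRESHOLD):
    """Split model predictions into auto-labeled items and items for humans.

    predictions: iterable of (item_id, label, confidence) tuples.
    Returns (auto_labeled, needs_human): a dict of item_id -> label for
    predictions above the threshold, and a list of item IDs to send
    back to human labeling.
    """
    auto_labeled = {}
    needs_human = []
    for item_id, label, confidence in predictions:
        if confidence > threshold:
            auto_labeled[item_id] = label
        else:
            needs_human.append(item_id)
    return auto_labeled, needs_human


# Hypothetical model output for three unlabeled images.
preds = [
    ("img-1", "five", 0.99),
    ("img-2", "two", 0.62),
    ("img-3", "seven", 0.95),  # exactly at the threshold, so not auto-labeled
]
auto, human = route_predictions(preds)
print(auto)
print(human)
```

Items in `human` go back to the workforce, and their consolidated labels join the labeled dataset used for the next round of training, so the share of auto-labeled items can grow with each pass.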
Demo

What does this mean for you?

• Lower the total cost of data labeling by 70%
• Securely manage datasets
• Increase the accuracy of datasets

Learn from AWS experts. Advance your skills and knowledge. Build your future in the AWS Cloud.

• Digital Training: Free, self-paced online courses built by AWS experts
• Classroom Training: Classes taught by accredited AWS instructors
• AWS Certification: Exams to validate expertise with an industry-recognized credential

Ready to begin building your cloud skills?
Get started at: https://www.aws.training/

Why work with an APN Partner?
APN Partners are uniquely positioned to help your organization at any stage of your cloud adoption journey, and they:
• Share your goals, focused on your success
• Help you take full advantage of all the business benefits that AWS has to offer
• Provide services and solutions to support any AWS use case across your full customer life cycle

APN Partners with deep expertise in AWS services:
• AWS Managed Service Provider (MSP) Partners: APN Partners with cloud infrastructure and application migration expertise
• AWS Competency Partners: APN Partners with verified, vetted, and validated specialized offerings
• AWS Service Delivery Partners: APN Partners with a track record of delivering specific AWS services to customers

Find the right APN Partner for your needs: https://aws.amazon.com/partners/find/

Thank you for attending AWS Innovate
We hope you found it interesting! A kind reminder to complete the survey.
Let us know what you thought of today’s event and how we can improve the event
experience for you in the future.

[email protected]
twitter.com/AWSCloud
facebook.com/AmazonWebServices
youtube.com/user/AmazonWebServices
slideshare.net/AmazonWebServices
twitch.tv/aws
