MoveNet SinglePose Model Card

This document summarizes two variants of the MoveNet.SinglePose model: Lightning and Thunder. Lightning targets more than 50 FPS, while Thunder is more accurate and still targets more than 30 FPS. Both models take an RGB image and predict the locations and confidence scores of 17 human keypoints. The models were trained on the COCO Keypoint and Active datasets and evaluated on single-person subsets of each; on the Active set, keypoint mAP differs by less than 5% across gender, age, and skin tone groups. In the browser benchmark, Lightning ran in 10.5 to 39.0 ms and Thunder in 15.0 to 64.0 ms per image, depending on the hardware.


MoveNet.SinglePose

Model Details
A convolutional neural network model that runs on RGB images and predicts the joint
locations of a single person. The model is designed to run in real time, either in the browser
using TensorFlow.js or on-device using TF Lite, and targets movement/fitness activities. Two
variants are presented:
● MoveNet.SinglePose.Lightning: A lower-capacity model that can run at >50 FPS on most
modern laptops while achieving good performance.
● MoveNet.SinglePose.Thunder: A higher-capacity model that delivers better prediction
quality while still achieving real-time (>30 FPS) speed. Naturally, thunder will lag behind
the lightning, but it will pack more of a punch.

Model Specifications
Model Architecture
MobileNetV2 image feature extractor with Feature Pyramid Network decoder (to stride of 4)
followed by CenterNet prediction heads with custom post-processing logic. Lightning uses
depth multiplier 1.0 while Thunder uses depth multiplier 1.75.

Inputs
A frame of video or an image, represented as an int32 tensor of shape 192x192x3 (Lightning) or
256x256x3 (Thunder). Channel order: RGB with values in [0, 255].
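The specifications above can be captured in a few lines of Python. This is only a sketch: the `Variant` class and the `check_input` helper are hypothetical names introduced here, not part of any MoveNet API, and frames are modeled as nested lists to keep the example dependency-free. It also assumes that "stride of 4" means the decoder output has one quarter of the input resolution per side.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical record of one MoveNet.SinglePose variant (illustration only)."""
    name: str
    input_size: int          # side length of the square input, in pixels
    depth_multiplier: float  # MobileNetV2 depth multiplier

    @property
    def feature_map_size(self) -> int:
        # Assumption: stride 4 means the decoder output is input_size / 4 per side.
        return self.input_size // 4


LIGHTNING = Variant("Lightning", 192, 1.0)
THUNDER = Variant("Thunder", 256, 1.75)


def check_input(frame, variant):
    """Return True if frame is size x size x 3 with integer values in [0, 255]."""
    size = variant.input_size
    if len(frame) != size:
        return False
    for row in frame:
        if len(row) != size:
            return False
        for pixel in row:
            if len(pixel) != 3:
                return False
            if any(not isinstance(c, int) or c < 0 or c > 255 for c in pixel):
                return False
    return True
```

Under that assumption, Lightning's prediction heads operate on a 48x48 grid and Thunder's on a 64x64 grid.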

Outputs
A float32 tensor of shape [1, 1, 17, 3].
● The first two channels of the last dimension represent the yx coordinates (normalized to the
image frame, i.e. in the range [0.0, 1.0]) of the 17 keypoints, in the order: [nose, left eye,
right eye, left ear, right ear, left shoulder, right shoulder, left elbow, right elbow, left wrist,
right wrist, left hip, right hip, left knee, right knee, left ankle, right ankle].
● The third channel of the last dimension represents the prediction confidence score of
each keypoint, also in the range [0.0, 1.0].
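To make the output layout concrete, here is a minimal Python sketch that turns a [1, 1, 17, 3] result into named keypoints in pixel coordinates. The function name `decode_keypoints` and the nested-list representation of the tensor are assumptions made for illustration; a real application would receive a TensorFlow.js or TF Lite tensor instead.

```python
KEYPOINT_NAMES = [
    "nose", "left eye", "right eye", "left ear", "right ear",
    "left shoulder", "right shoulder", "left elbow", "right elbow",
    "left wrist", "right wrist", "left hip", "right hip",
    "left knee", "right knee", "left ankle", "right ankle",
]


def decode_keypoints(output, image_height, image_width):
    """Map a [1, 1, 17, 3] output to {name: (x_px, y_px, score)}."""
    keypoints = output[0][0]  # drop the two leading singleton dimensions
    if len(keypoints) != len(KEYPOINT_NAMES):
        raise ValueError("expected 17 keypoints")
    decoded = {}
    for name, (y, x, score) in zip(KEYPOINT_NAMES, keypoints):
        # The model emits (y, x) normalized to [0, 1]; swap to (x, y) in pixels.
        decoded[name] = (x * image_width, y * image_height, score)
    return decoded


# Hypothetical output in which every keypoint sits at y=0.5, x=0.25 with score 0.9.
fake_output = [[[[0.5, 0.25, 0.9]] * 17]]
print(decode_keypoints(fake_output, 480, 640)["nose"])  # (160.0, 240.0, 0.9)
```

Note the coordinate order: the tensor stores y before x, which is easy to get wrong when drawing on a canvas that expects (x, y).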

Authors
(Equal contributions)
Francois Beletti, Google
Yu-Hui Chen, Google
Ard Oerlemans, Google
Ronny Votel, Google

Licensed Under Apache License, Version 2.0

Intended Use

Primary Intended Uses

● Optimized to run in the browser using TensorFlow.js with WebGL support, or on-device
with TF Lite.
● Tuned to be robust at detecting fitness/fast movement with difficult poses and/or
motion blur.
● Most suitable for detecting the pose of a single person who is 3 to 6 ft away from the
webcam of the device that captures the video stream.
● Focuses on detecting the pose of the person who is closest to the image center and ignores
other people in the image frame (i.e. background people rejection).
● The model predicts 17 human keypoints of the full body even when they are occluded.
For keypoints that are outside of the image frame, the model will emit low
confidence scores. A confidence threshold (recommended default: 0.3) can be used to
filter out low-confidence predictions.
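The confidence threshold mentioned in the last item can be applied with a simple filter. The sketch below is illustrative only: `filter_confident` is a hypothetical helper, the keypoints are plain (y, x, score) tuples, and treating a score exactly equal to the threshold as confident is an assumption, since the card does not say whether the comparison is strict.

```python
DEFAULT_THRESHOLD = 0.3  # recommended default from the model card


def filter_confident(keypoints, threshold=DEFAULT_THRESHOLD):
    """Keep (y, x, score) triples whose score meets the threshold.

    Returns (index, y, x, score) tuples so callers can still tell
    which of the 17 joints each surviving entry refers to.
    """
    return [
        (index, y, x, score)
        for index, (y, x, score) in enumerate(keypoints)
        if score >= threshold  # assumption: scores equal to the threshold pass
    ]
```

Keeping the original index matters because dropping entries would otherwise shift the positions that identify each joint.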

Primary Intended Users

● People who build applications (e.g. fitness/physical movement, AR entertainment) that
require very fast inference and good quality single-person pose detection (with
background people rejection) on standard consumer devices (e.g. laptops, tablets, cell
phones).

Out-of-scope Use Cases

● This model is not intended for detecting poses of multiple people in the image.
● Any form of surveillance or identity recognition is explicitly out of scope and not enabled
by this technology.
● The model does not store/use/send any information in the input images at inference
time.

Evaluation Data
● COCO Keypoint Dataset Validation Set 2017: In-the-wild images with diverse scenes,
instance sizes, and occlusions. The original validation set contains 5k images (images,
annotations) in total. Only the images that contain a single person are retained as the
evaluation set of this model, 919 images in total. The dataset is chosen to evaluate
the model performance in the general in-the-wild scenario.
● Active Dataset Evaluation Set: Images sampled from YouTube fitness, yoga, and
dance videos that capture people's movements. It contains diverse poses and motion
with more motion blur and self-occlusions. The set contains 1161 images with a single
person in the frame. This dataset is chosen to evaluate the model performance on the
targeted domain, i.e. fitness/human motion.
Training Data
● COCO Keypoint Dataset Training Set 2017: In-the-wild images with diverse scenes,
instance sizes, and occlusions. The original training set contains 64k images (images,
annotations). The images with three or more people were filtered out, resulting in a 28k
final training set.
● Active Dataset Training Set: Images sampled from YouTube fitness videos that
capture people exercising (e.g. HIIT, weight-lifting, etc.), stretching, or dancing. It
contains diverse poses and motion with more motion blur and self-occlusions. The set of
images with a single person contains 23.5k images.
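The person-count filtering described for the COCO subsets could look roughly like the following Python sketch. It is not the authors' pipeline: the helper names are hypothetical, the annotation records are simplified COCO-style dictionaries, and requiring at least one person per image is an assumption, because the card only states the upper limits.

```python
from collections import Counter

PERSON_CATEGORY_ID = 1  # "person" category id in COCO annotations


def people_per_image(annotations):
    """Count person annotations per image_id."""
    return Counter(
        a["image_id"] for a in annotations
        if a["category_id"] == PERSON_CATEGORY_ID
    )


def select_images(annotations, max_people):
    """Return sorted image ids that contain between 1 and max_people persons."""
    counts = people_per_image(annotations)
    return sorted(img for img, n in counts.items() if 1 <= n <= max_people)


# Evaluation keeps single-person images; training drops images with 3+ people.
toy_annotations = [
    {"image_id": 1, "category_id": 1},
    {"image_id": 2, "category_id": 1},
    {"image_id": 2, "category_id": 1},
    {"image_id": 3, "category_id": 1},
    {"image_id": 3, "category_id": 1},
    {"image_id": 3, "category_id": 1},
    {"image_id": 4, "category_id": 18},  # not a person
]
evaluation_ids = select_images(toy_annotations, max_people=1)
training_ids = select_images(toy_annotations, max_people=2)
```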

Factors

Groups
To perform fairness evaluation, we analyze the model performance under different person
attributes and categories:
● Gender: Male/Female
● Age: Young/Middle-age/Old
● Skin tone: Medium/Darker/Lighter

Instrumentation
The training dataset images were captured in a real-world environment with different light, noise,
and motion. Therefore, the model is robust to the input video streams that are captured through
common devices’ webcams.

Environments
The model is trained on images with various lighting, noise, motion conditions and with diverse
augmentations.

Metrics
● Keypoint mean average precision (mAP) with Object Keypoint Similarity (OKS):
this is the standard metric used to evaluate the quality of the predictions of a keypoint
model in the COCO competition.
● Inference Time: the time spent to run the model inference for a single image measured
in milliseconds.
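For readers unfamiliar with OKS, the sketch below computes it for one person following the COCO keypoint evaluation definition: each labeled keypoint contributes exp(-d^2 / (2 s^2 k^2)), where d is the distance between prediction and ground truth, s^2 is the object area, and k is a per-keypoint constant. The function name and argument layout are assumptions made for this example, and the constants are the ones used by the COCO evaluation rather than anything stated in this card. Keypoint mAP then averages precision over a range of OKS thresholds.

```python
import math

# Per-keypoint sigmas from the COCO keypoint evaluation, in COCO keypoint order.
COCO_SIGMAS = [0.026, 0.025, 0.025, 0.035, 0.035, 0.079, 0.079, 0.072, 0.072,
               0.062, 0.062, 0.107, 0.107, 0.087, 0.087, 0.089, 0.089]


def oks(pred, truth, visible, area):
    """Object Keypoint Similarity between predicted and ground-truth keypoints.

    pred, truth: lists of 17 (x, y) pixel coordinates.
    visible: list of 17 booleans; only labeled keypoints contribute.
    area: ground-truth object area in square pixels (must be positive).
    """
    total, count = 0.0, 0
    for (px, py), (tx, ty), vis, sigma in zip(pred, truth, visible, COCO_SIGMAS):
        if not vis:
            continue
        d2 = (px - tx) ** 2 + (py - ty) ** 2
        k = 2.0 * sigma  # COCO uses k = 2 * sigma
        total += math.exp(-d2 / (2.0 * area * k * k))
        count += 1
    return total / count if count else 0.0
```

A perfect prediction scores 1.0, and the score decays toward 0 as keypoints drift, more slowly for large objects and for joints with larger sigmas such as hips.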
Quantitative Analyses

Prediction Quality
The following tables show the evaluation result for different attributes/categories. Both models
perform fairly (< 5% performance differences between categories) on our targeted Active Single
Person Image Set.

COCO Val2017 Single Person Image Set

| Gender | % dataset | Keypoint mAP (Lightning) | Keypoint mAP (Thunder) |
|---|---|---|---|
| Male | 63.1 | 67.4 | 78.7 |
| Female | 36.9 | 65.4 | 76.6 |

| Age | % dataset | Keypoint mAP (Lightning) | Keypoint mAP (Thunder) |
|---|---|---|---|
| Young | 72.2 | 65.6 | 76.6 |
| Middle-age | 17.1 | 68.0 | 78.0 |
| Old | 10.7 | 72.1 | 81.5 |

| Skin Tone | % dataset | Keypoint mAP (Lightning) | Keypoint mAP (Thunder) |
|---|---|---|---|
| Darker | 26.8 | 60.5 | 74.4 |
| Medium | 4.0 | 61.2 | 73.7 |
| Lighter | 69.2 | 74.4 | 82.9 |

Active Single Person Image Set

| Gender | % dataset | Keypoint mAP (Lightning) | Keypoint mAP (Thunder) |
|---|---|---|---|
| Male | 46.0 | 90.2 | 93.7 |
| Female | 54.0 | 87.8 | 92.3 |

| Age | % dataset | Keypoint mAP (Lightning) | Keypoint mAP (Thunder) |
|---|---|---|---|
| Young | 87.6 | 89.1 | 93.3 |
| Middle-age | 10.5 | 89.3 | 91.5 |
| Old | 1.9 | 85.7 | 90.0 |

| Skin Tone | % dataset | Keypoint mAP (Lightning) | Keypoint mAP (Thunder) |
|---|---|---|---|
| Darker | 15.4 | 89.1 | 93.1 |
| Medium | 2.5 | 92.2 | 93.3 |
| Lighter | 82.1 | 92.9 | 95.4 |

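The "< 5%" fairness claim for the Active set can be checked directly from the reported numbers. The short Python sketch below does so, reading the claim as a difference in mAP points between categories; that reading is an assumption, since the card does not say whether the 5% is absolute or relative. The dictionary simply restates the Active set values reported above.

```python
# Keypoint mAP on the Active Single Person Image Set: (Lightning, Thunder).
ACTIVE_MAP = {
    "Gender": {"Male": (90.2, 93.7), "Female": (87.8, 92.3)},
    "Age": {"Young": (89.1, 93.3), "Middle-age": (89.3, 91.5), "Old": (85.7, 90.0)},
    "Skin tone": {"Darker": (89.1, 93.1), "Medium": (92.2, 93.3), "Lighter": (92.9, 95.4)},
}


def max_gap(groups, model_index):
    """Largest difference in mAP points between any two categories."""
    scores = [values[model_index] for values in groups.values()]
    return max(scores) - min(scores)


for attribute, groups in ACTIVE_MAP.items():
    lightning = max_gap(groups, 0)
    thunder = max_gap(groups, 1)
    print(f"{attribute}: Lightning {lightning:.1f}, Thunder {thunder:.1f}")
```

With these figures the largest gap is 3.8 points (Lightning, skin tone), which is consistent with the stated bound.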
Speed Benchmark
Both models were benchmarked using the TensorFlow.js benchmark tool in the Chrome browser
on several systems with either an integrated or a dedicated GPU. During the test, we enabled the
“pack depthwise” option and took the median inference time over 500 runs.

| Hardware Description | GPU | Inference Time (Lightning) | Inference Time (Thunder) |
|---|---|---|---|
| ThinkPad v7 gLinux laptop | Intel UHD 620 Graphics | 39.0 ms | 64.0 ms |
| MacBook Pro 2019 13” | Intel Iris Plus Graphics 655 1.5 GB | 17.8 ms | 33.8 ms |
| MacBook Pro 2019 15” | Intel UHD 630 Graphics | 18.7 ms | 44.6 ms |
| MacBook Pro 2019 15” | Radeon Pro 555X 4 GB | 16.4 ms | 22.1 ms |
| Lenovo P520 2018 | GeForce RTX 2080Ti | 10.5 ms | 15.0 ms |
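To relate these latencies to the frame-rate targets stated for each variant, the following Python sketch converts the median inference times into an upper bound on frames per second. The dictionary keys are shortened labels chosen for this example, and the conversion assumes inference is the only per-frame cost, which a real application (capture, preprocessing, rendering) will not satisfy.

```python
# Median inference times in milliseconds from the table: (Lightning, Thunder).
BENCHMARK_MS = {
    "ThinkPad (Intel UHD 620)": (39.0, 64.0),
    'MacBook Pro 2019 13" (Intel Iris Plus 655)': (17.8, 33.8),
    'MacBook Pro 2019 15" (Intel UHD 630)': (18.7, 44.6),
    'MacBook Pro 2019 15" (Radeon Pro 555X)': (16.4, 22.1),
    "Lenovo P520 (GeForce RTX 2080Ti)": (10.5, 15.0),
}


def fps(latency_ms):
    """Upper bound on frames per second if inference is the only per-frame cost."""
    return 1000.0 / latency_ms


for hardware, (lightning_ms, thunder_ms) in BENCHMARK_MS.items():
    print(f"{hardware}: Lightning {fps(lightning_ms):.1f} FPS, "
          f"Thunder {fps(thunder_ms):.1f} FPS")
```

A 20 ms latency corresponds to exactly 50 FPS, so rows with a Lightning time above 20 ms, or a Thunder time above roughly 33.3 ms, fall below the respective targets on that hardware.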
