Project Report
Bachelor of Technology
by
Ojas Kapre
Roll No: 1611022
Harsh Patel
Roll No: 1611032
Murtaza Patrawala
Roll No: 1611034
Tanay Raul
Roll No: 1611037
Guide
Certificate
This is to certify that the dissertation report entitled WeAR - Wear Clothes in
Augmented Reality, submitted by Ojas Kapre (Roll No. 1611022), Harsh Patel
(Roll No. 1611032), Murtaza Patrawala (Roll No. 1611034) and Tanay Raul (Roll No.
1611037) at the end of semester VIII of LY B. Tech, is a bona fide record of work
done in partial fulfillment of the requirements for the degree of Bachelor of
Technology in Computer Engineering of the University of Mumbai.
_________________ _____________________
Guide Head of the Department
_________________
Principal
Date:
Place: Mumbai-77
_________________
Internal Examiner
_________________
External Examiner
Date:
Place: Mumbai-77
DECLARATION
We declare that this written thesis submission represents work done based on our own
and/or others’ ideas, with the original sources adequately cited and referenced. We also
declare that we have adhered to all principles of intellectual property, academic honesty
and integrity, and that we have not misrepresented, fabricated or falsified any
idea/data/fact/source/original work/matter in our submission.
We understand that any violation of the above will be cause for disciplinary action by the
college and may also invoke penal action from the sources which have not been properly
cited or from whom proper permission has not been sought.
______________________________        ______________________________
Signature of the Student              Signature of the Student
Roll No. 1611022                      Roll No. 1611032

______________________________        ______________________________
Signature of the Student              Signature of the Student
Roll No. 1611034                      Roll No. 1611037
Date:
Place: Mumbai-77
Contents
List of Figures
List of Tables
1 Introduction
2 Literature Survey
4.3.1 Introduction
5.1 Conclusions
Acknowledgements
Bibliography
2.4 Image processing: original image (the upper-left corner), removed background
(the upper-right corner), Canny filter (the bottom-left corner), bounding box of
the object (the bottom-right corner)
List of Tables
Software Requirements
● Android: API 24+
● ARCore support on the smartphone
The idea of virtual trial rooms is not new, and many attempts have been made in the past
to address the issues that customers face while shopping for clothes. However, with new
innovations in the fields of Augmented Reality and Machine Learning, these applications
have become better and more realistic. Many prototypes have been developed, ranging from
simply projecting a static 2D image of clothes to rendering a 3D cloth model on the
user's body in real time.
Very primitive attempts tried to render a 2D image of cloth on screen. However, this was
not real-time: the rendered cloth image was static, and the user had to align himself to
the cloth image in order to gain a visual experience of the garment. This approach was of
limited benefit, as it focused on aligning the user to the garment rather than the other
way around.
Recent attempts have used Machine Learning for Human Pose Estimation through
Convolutional Neural Networks (CNNs), eliminating the previous problem of the human
aligning to the image: the cloth image is automatically mapped to the customer's body
based on the estimated human pose. ‘VITON: An Image-based Virtual Try-on Network’ [1], a
technical paper published by Xintong Han, Zuxuan Wu, Zhe Wu, Ruichi Yu and Larry S. Davis
from the University of Maryland, College Park in June 2018, is one such attempt; it uses
a state-of-the-art method to transfer any 2D cloth image onto a human body with the help
of advanced deep learning models, namely encoder-decoders. The input (i.e. the person's
pose image and the 2D cloth model) was passed through a multi-task encoder-decoder model
which identified the clothed region and generated a cloth mask.
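As a schematic illustration of the encoder-decoder idea (compress the image to a coarse
bottleneck, then expand back to full resolution), the toy sketch below uses average
pooling and nearest-neighbour upsampling in place of VITON's learned convolutional
layers; it is not the paper's actual network:

```python
import numpy as np

def encode(x, factor=2):
    """'Encoder': average-pool the image down by `factor` (toy stand-in for conv layers)."""
    h, w = x.shape
    return x.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def decode(z, factor=2):
    """'Decoder': nearest-neighbour upsample back to the original resolution."""
    return np.repeat(np.repeat(z, factor, axis=0), factor, axis=1)

img = np.zeros((4, 4))
img[1:3, 1:3] = 1.0                 # a 2x2 'clothed region'
mask = decode(encode(img)) > 0.2    # coarse cloth mask recovered from the bottleneck
print(mask.astype(int))             # coarse mask covering the garment region
```

The real model learns which pixels belong to the clothed region; here the bottleneck
merely blurs and re-expands the region to show the shape of the computation.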
Fig 2.4 - Image processing: original image (the upper-left corner), removed background (the upper-right corner),
Canny filter (the bottom-left corner), bounding box of the object (the bottom-right corner)
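Two of the steps in Fig 2.4, background removal and bounding-box extraction, can be
sketched with plain numpy as follows (the Canny edge step is omitted, and a uniform
zero-valued background is assumed for illustration):

```python
import numpy as np

def remove_background(img, bg_value=0):
    """Foreground mask: pixels that differ from the (assumed uniform) background."""
    return img != bg_value

def bounding_box(mask):
    """Smallest axis-aligned box enclosing all foreground pixels: (top, left, bottom, right)."""
    rows = np.any(mask, axis=1)
    cols = np.any(mask, axis=0)
    top, bottom = np.where(rows)[0][[0, -1]]
    left, right = np.where(cols)[0][[0, -1]]
    return (int(top), int(left), int(bottom), int(right))

# toy 6x6 "image": a 2x3 object on a zero background
img = np.zeros((6, 6), dtype=np.uint8)
img[2:4, 1:4] = 255
mask = remove_background(img)
print(bounding_box(mask))  # (2, 1, 3, 3)
```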
and m1 = -1.028, m2 = 0.06892, b2 = 203.7 are constants obtained from the analyzed data
Estimating object height:
The equations were created according to basic trigonometric rules, and the sets of tests
are described in that paper. Based on these tests, using various objects at different
heights and at different distances from the camera, it was found that the measurement
deviation is smaller than 10%.
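The original equations did not survive extraction here; as an illustration only, the
underlying trigonometric idea can be sketched with a simple pinhole-camera model. The
focal length and distance below are assumed example values, not the fitted constants
m1, m2, b2 reported above:

```python
def estimate_height(pixel_height, distance_m, focal_px):
    """Pinhole model: real height = distance * (object height in pixels / focal length in pixels)."""
    return distance_m * pixel_height / focal_px

# example: object spans 400 px, camera 2.5 m away, focal length 1000 px
h = estimate_height(400, 2.5, 1000)
print(round(h, 2))  # 1.0 (metres)
```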
ACTIVITIES                                            TIME-FRAME
Selection of algorithms                               X X X X
Implementation (pose estimation)                      X X
Implementation (size estimation)                      X X
Development, Sprint 1 (UI development)                X X
Development, Sprint 2 (3D clothes model creation)     X X
Development, Sprint 3 (mapping of clothes on users)   X X
Development, Sprint 4                                 X X
Deployment                                            X X X X
Table 3.3 - Task and responsibility matrix
● Accurate Mapping
Proper alignment of various predefined points of the 3D cloth models with the actual body parts.
Safety Requirements
No harm or damage will be incurred from the use of this product.
Security Requirements
● Data acquired from the user should remain private.
● Photos/video frames that might be captured during mapping should remain private.
Database Requirements
There will be a database which will store designer names, cloth type ( full t-shirt, half sleeves,
etc), cloth color, cloth pattern, gender, clothing size, etc.
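A minimal sketch of such a catalogue table is shown below; the table and column names
are our illustration of the stated requirement, not the report's actual schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE clothes (
        id        INTEGER PRIMARY KEY,
        designer  TEXT NOT NULL,
        type      TEXT NOT NULL,   -- e.g. 'full t-shirt', 'half sleeves'
        color     TEXT,
        pattern   TEXT,
        gender    TEXT,
        size      TEXT             -- e.g. 'S', 'M', 'L'
    )
""")
conn.execute("INSERT INTO clothes (designer, type, color, pattern, gender, size) "
             "VALUES ('Acme', 'full t-shirt', 'blue', 'solid', 'unisex', 'M')")
row = conn.execute("SELECT designer, size FROM clothes").fetchone()
print(row)  # ('Acme', 'M')
```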
Software Quality Attributes
The important quality attributes for the user are reliability and correctness. The user
size estimation should be very accurate. Also, the pose estimation and the mapping of the
clothes model should be done in real time for a better user experience.
Business Rules
The business model is to provide services for online clothes retailers such as Flipkart,
Amazon, Myntra, etc. This product will also help fashion designers design and try out
different new clothes patterns.
● Client-side implementation
○ Mapping Operation
The main objective of the mapping operation is to superimpose the 3D cloth model
on the user’s body.
The inputs to this operation are:
● 2D pose estimation points
● 3D pose estimation points
● Selected 3D cloth model
● User video
The solution we came up with to deal with these issues is as follows:
Conversion from 3D pose estimation points to Unity coordinates.
This was achieved with the help of the Camera.ScreenToWorldPoint function in Unity.
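Unity performs this conversion internally; conceptually, for a simple pinhole camera at
the origin with a given vertical field of view, the mapping from a screen pixel plus a
depth value to camera-space coordinates looks roughly as follows (illustrative math only,
not Unity's exact implementation):

```python
import math

def screen_to_world(sx, sy, depth, width, height, fov_y_deg):
    """Map a screen pixel plus depth to camera-space coordinates (pinhole model)."""
    # half-extent of the view frustum at the given depth
    half_h = depth * math.tan(math.radians(fov_y_deg) / 2)
    half_w = half_h * (width / height)
    # normalise the pixel to [-1, 1], origin at the screen centre
    nx = (sx / width) * 2 - 1
    ny = (sy / height) * 2 - 1
    return (nx * half_w, ny * half_h, depth)

# the centre of a 1920x1080 screen maps onto the optical axis
print(screen_to_world(960, 540, 2.0, 1920, 1080, 60))  # (0.0, 0.0, 2.0)
```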
4.3.1. Introduction
This chapter documents and tracks the necessary information required to effectively define the
approach to be used in the testing of the project’s product. The Test Plan document is created
during the Planning Phase of the project. Its intended audience is the project manager, project
team, and testing team.
4.3.1.1 Test Approach
Proactive approach: the proactive approach involves anticipating possible bugs while the
modules are being built and resolving them as and when they are uncovered, so that the
integrated product is easier to debug, thereby reducing the number of variables.
4.3.2. Test Plan
4.3.2.1 Features and Modules to be tested
1) 2D pose estimation module
2) 3D pose estimation module
3) Video resolution conversion module
4) Estimated Pose scaling scripts
5) Selecting multiple patterns feature
6) Uploading a new pattern feature
7) Uploading a new video feature
8) Asynchronous Video Processing Module
9) Authentication module
4.3.2.2 Testing Method and Tools Used
Some modules were tested using an Automated Testing tool named ZAPTEST, while the
rest of the modules and features were manually tested.
Fig 4.5 - App Demo Screenshot 1 Fig 4.6 - App Demo Screenshot 2
3D Pose Estimation model analysis: the research paper for this model states that its
accuracy is about 45.5 mm. The metric is the average error in millimetres between the
ground truth and the prediction across all joints and cameras, after alignment of the
root (central hip) joint.
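The metric described above, often called MPJPE (mean per-joint position error) after
root alignment, can be sketched as follows with numpy; the joint coordinates here are
made-up example values:

```python
import numpy as np

def mpjpe_mm(pred, gt, root=0):
    """Mean per-joint position error (mm) after aligning the root (central hip) joint."""
    pred = pred - pred[root]   # translate so the root joints coincide
    gt = gt - gt[root]
    return float(np.mean(np.linalg.norm(pred - gt, axis=1)))

gt = np.array([[0.0, 0.0, 0.0], [100.0, 0.0, 0.0], [0.0, 200.0, 0.0]])
pred = gt + np.array([10.0, 0.0, 0.0])   # uniform 10 mm offset, removed by root alignment
print(mpjpe_mm(pred, gt))  # 0.0
```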
However, after debugging the 3D joint points in Unity, we found that the accuracy of this
model on our project dataset is much lower. Due to this low accuracy, our mapping
algorithm was adversely affected.
Processing analysis: processing time for a video with 300 frames.
Machine spec: Intel Core i5 processor (8 threads), 8 GB RAM.

Model                 Time
Total                 10 minutes
Table 4.1 - Processing Time Analysis
5.2 Acknowledgement
A project is the creative work of many minds, and proper synchronization between
individuals is a must for any project to be completed successfully. It is only the
complete dedication of the students, combined with the guidance of the college
professors, that allows any task to be completed.
assisted in the learning and successful implementation of the project ’WeAR’. We would
like to thank her for constantly motivating and pushing us to work harder, which turned
this project into a reality. We also express our appreciation to Prof. Bharathi HN for
sharing invaluable insights with us during the course of developing the project and at
several project seminars. We are also immensely grateful to our institute for providing
us with infrastructure support.
[1] Xintong Han, Zuxuan Wu, Zhe Wu, Ruichi Yu and Larry S. Davis, “VITON: An
Image-based Virtual Try-on Network”. [https://arxiv.org/abs/1711.08447]
[2] Akshay Shirsat, Samruddhi Sonimindia, Sushmita Patil, Nikita Kotecha, Prof Shweta
Koparde, "Virtual Trial Room", International Journal of Research in Advent Technology
(IJRAT), VOLUME-7 ISSUE-5, MAY 2019, pp. 182-185.
[https://doi.org/10.32622/ijrat.75201976]
[3] Amoli Vani, Dhwani Mehta and Prof. Suchita Patil, “Virtual Changing Room”. K. J.
Somaiya College of Engineering.
[4] Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov and Liang-Chieh
Chen, “MobileNetV2: Inverted Residuals and Linear Bottlenecks”. IEEE Conference on
Computer Vision and Pattern Recognition (CVPR), 2018.
[https://arxiv.org/abs/1801.04381]
[5] Ondrej Kainz, František Jakab, Matúš W. Horečný and Dávid Cymbalák, “Estimating the
Object Size from Static 2D Image”. 2015 International Conference and Workshop on
Computing and Communication (IEMCON).
[https://ieeexplore.ieee.org/document/7344423]
[6] Pose Estimation for mobile (Implementation of CPM and Hourglass model using
TensorFlow and MobileNetV2)
[https://github.com/edvardHua/PoseEstimationForMobile]
[7] Alejandro Newell, Kaiyu Yang, and Jia Deng, “Stacked Hourglass Networks for Human
Pose Estimation”. [https://arxiv.org/pdf/1603.06937.pdf]
[8] OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Field.
[https://arxiv.org/abs/1812.08008]
[9] A simple yet effective baseline for 3d human pose estimation.
[https://arxiv.org/pdf/1705.03098.pdf]