
UNIVERSITY OF SIALKOT

Academic Year : 2023 – 2024
Department : Computer Science
Program : MS Computer Science
Submitted by : Ali Imran Cheema
Submitted to : Dr. Jahanzeb
Reg No : 1230200412
Session : Spring 2023
Subject : Computer Vision
Total Credit Hours : 03
Assignment No : 1
Date of Submission : 11 / 06 / 2023

Student Sign ____________        Professor Sign ____________
Title: Multi-view Reconstruction

Authors: Hao Chen, Rui Zhao, Xiangyu Zhang, and Jian Sun
Published: March 8, 2021
Source: arXiv (preprint)

Abstract:
Multi-view 3D reconstruction is a challenging problem in computer vision. The
goal is to recover the 3D structure of an object from a set of 2D images. This
problem is difficult due to the ambiguity of the correspondence between the 2D
images and the 3D object.
In this paper, we propose a new method for multi-view 3D reconstruction based
on the Transformer architecture. The Transformer is a self-attention model that
has been shown to be effective for a variety of natural language processing tasks.
We show that the Transformer can also be used for multi-view 3D reconstruction.
Our method first extracts features from the 2D images using a convolutional
neural network. These features are then fed to a Transformer network, which
learns to predict the 3D structure of the object. The Transformer network uses
self-attention to learn the relationships between the features from different views.
We evaluated our method on the ShapeNet dataset, which contains a large number
of 3D models. Our method was able to reconstruct the 3D models with high
accuracy. We also showed that our method is more efficient than traditional
methods for multi-view 3D reconstruction.
Methods:
Our method for multi-view 3D reconstruction consists of two main steps:
Feature extraction: We first extract features from the 2D images using a
convolutional neural network. The features extracted from each image are then
stacked together to form a feature tensor.
3D reconstruction: We then use a Transformer network to predict the 3D
structure of the object from the feature tensor. The Transformer network uses self-
attention to learn the relationships between the features from different views.
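The summary does not give the paper's exact layer configuration, so the following is only a minimal PyTorch sketch of the two-step pipeline described above: a CNN backbone (ResNet-18 is an assumed choice) extracts one feature vector per view, a Transformer encoder applies self-attention across the views, and a small hypothetical head decodes a coarse occupancy volume. Dimensions, the backbone, and the output resolution are illustrative, not the paper's reported settings.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class MultiViewTransformer(nn.Module):
    """Sketch of the two-step pipeline: per-view CNN features, then a
    Transformer encoder that relates the views via self-attention, then a
    small head that predicts a coarse 32^3 occupancy volume."""

    def __init__(self, d_model=256, n_heads=8, n_layers=4, vox=32):
        super().__init__()
        backbone = models.resnet18(weights=None)
        self.cnn = nn.Sequential(*list(backbone.children())[:-1])  # globally pooled features
        self.proj = nn.Linear(512, d_model)                        # 512 -> transformer width
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vox ** 3)                   # fused feature -> voxel logits
        self.vox = vox

    def forward(self, views):                                # views: (B, V, 3, H, W)
        B, V = views.shape[:2]
        feats = self.cnn(views.flatten(0, 1)).flatten(1)     # (B*V, 512), one vector per view
        tokens = self.proj(feats).view(B, V, -1)             # one token per view
        fused = self.encoder(tokens).mean(dim=1)             # self-attention across views, then pool
        logits = self.head(fused)                            # occupancy logits
        return logits.view(B, self.vox, self.vox, self.vox)

# Example: 2 objects, 4 views each, at 224x224 resolution
model = MultiViewTransformer()
print(model(torch.randn(2, 4, 3, 224, 224)).shape)   # torch.Size([2, 32, 32, 32])
```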
Main findings:
Our method was able to reconstruct the 3D models with high accuracy. We also
showed that our method is more efficient than traditional methods for multi-view
3D reconstruction.
Conclusion:
We have proposed a new method for multi-view 3D reconstruction based on the
Transformer architecture. Our method was able to reconstruct the 3D models with
high accuracy and is more efficient than traditional methods.
Applications:
Multi-view 3D reconstruction has a wide range of applications in computer
vision, including:
• 3D object recognition
• 3D scene understanding
• Augmented reality
• Virtual reality
• Robotics
Our method can be used to improve the accuracy and efficiency of these
applications.
Q No. 2) Derive the perspective projection equations for a virtual
image located at a distance f' in front of the pinhole.
Ans :
Place the pinhole at the origin O and let the optical axis coincide with the Z
axis, so a scene point is P = (X, Y, Z) with Z > 0. For the physical image
plane located at a distance f behind the pinhole, similar triangles give the
usual perspective projection equations:

x = -f * (X / Z)
y = -f * (Y / Z)

where:

• x and y are the coordinates of the projected point on the image plane
• X and Y are the coordinates of the original point in 3D space
• Z is the distance of the original point from the pinhole along the optical axis
• f is the distance from the pinhole to the image plane (the focal length)

The minus signs express the fact that the physical image is inverted.

Now place a virtual image plane at a distance f' in front of the pinhole, i.e.
on the same side of the pinhole as the scene. The ray from P through the
pinhole O crosses this plane at depth f', so the coordinates of P are simply
scaled by f'/Z and the projection equations become:

x = f' * (X / Z)

y = f' * (Y / Z)

These equations give the coordinates (x, y) of the projection of a point
P = (X, Y, Z) onto the virtual image plane at distance f' in front of the
pinhole. Because this plane lies in front of the pinhole, the minus signs
disappear and the image is upright, which is why this convention is commonly
used in computer vision.
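As a quick numerical check of these equations, here is a minimal NumPy sketch of the virtual-image projection derived above; the function name, array layout, and example values are my own illustrative choices.

```python
import numpy as np

def project_virtual_plane(points, f_prime):
    """Project 3D points onto a virtual image plane at distance f'
    in front of the pinhole (pinhole at the origin, optical axis = Z)."""
    points = np.asarray(points, dtype=float)
    X, Y, Z = points[:, 0], points[:, 1], points[:, 2]
    # x = f' * X / Z, y = f' * Y / Z  (no sign flip: the virtual image is upright)
    return np.stack([f_prime * X / Z, f_prime * Y / Z], axis=1)

# Example: a point at (2, 1, 10) projected onto a virtual plane at f' = 0.05
print(project_virtual_plane([[2.0, 1.0, 10.0]], 0.05))   # [[0.01  0.005]]
```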
Q No. 3) Give a geometric construction of the image P' of a point
P given the two focal points F and F' of a thin lens.
Ans : Here are the steps for constructing the image P' of a point P given the
two focal points F and F' of a thin lens:

• Draw a line to represent the optical axis and, perpendicular to it, the plane of the thin lens; mark the lens centre O where they meet.
• Mark the two focal points F and F' on the optical axis, with F on the same side of the lens as the object point P and F' on the opposite side.
• Draw a ray from P parallel to the optical axis. After it is refracted by the lens, this ray passes through F'.
• Draw a second ray from P through the centre O of the lens. A ray through the centre of a thin lens passes through undeviated. (Equivalently, a ray from P through F emerges from the lens parallel to the optical axis.)
• The point where the two refracted rays intersect is P'.

P' is the image of point P formed by the thin lens. A numerical check of the construction is sketched below.
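The graphical construction above locates P' by intersecting rays; the same image position can be checked numerically with the Gaussian thin-lens equation 1/d_o + 1/d_i = 1/f. The small sketch below assumes that convention (distances measured from the lens, f positive for a converging lens); the function name and example values are my own.

```python
def thin_lens_image(d_o, f):
    """Return (image distance d_i, lateral magnification m) for a thin lens.

    Gaussian thin-lens equation: 1/d_o + 1/d_i = 1/f
    d_o : object distance from the lens (positive on the object side)
    f   : focal length (positive for a converging lens)
    """
    if d_o == f:
        raise ValueError("object at the focal point: the image is at infinity")
    d_i = 1.0 / (1.0 / f - 1.0 / d_o)   # positive d_i -> real image behind the lens
    m = -d_i / d_o                      # negative m -> inverted image
    return d_i, m

# Example: object 30 cm from a lens with f = 10 cm
print(thin_lens_image(30.0, 10.0))   # (15.0, -0.5): real image 15 cm behind the lens, inverted, half size
```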

The End
