
Note) Slides and example codes are available:

https://github.com/mint-lab/3dv_tutorial

An Invitation to 3D Vision:
Two-View Geometry

Sunglok Choi, Assistant Professor, Ph.D.


Computer Science and Engineering Department, SEOULTECH
[email protected] | https://mint-lab.github.io/
Review) Absolute Camera Pose Estimation

▪ Example) Pose estimation (book) + camera calibration – initially given K [pose_estimation_book3.py]
  – Preparation: load the cover image, extract ORB features, and build the 3D object points 𝐗1, 𝐗2, …
  – For each input image:
    • Grab the image, extract ORB features 𝐱1, 𝐱2, …, and match them to the object features → tentative matches
    • [Outlier Rejection] Remove wrong matches using cv::findHomography() (free from K) → correct matches
    • [Camera Pose Estimation] Estimate the camera parameters using cv::calibrateCamera(), which also refines the possibly wrong initial K
    • [Camera Pose Update] Intrinsic: K (initially known); extrinsic: R, 𝐭
    • Draw the box using cv::projectPoints()

Table of Contents: Two-view Geometry

▪ Planar Homography
▪ Epipolar Geometry
  – Epipolar constraint
  – Fundamental and essential matrix
▪ Relative Camera Pose Estimation
▪ Triangulation

Image: Self-Portrait by Yun Du-seo (윤두서, 1668–1715), Korean National Treasure No. 240

Planar Homography

▪ A planar homography H maps a point 𝐱 on one image of a plane to its corresponding point 𝐱′ in another view: 𝐱′ = H𝐱.
▪ The transformation H is a composition of rotation, translation, scaling / aspect ratio, shear, and perspective projection.

Planar Homography

▪ Euclidean transformation (a.k.a. rigid transformation) – 3 DOF
  – Matrix form: H = [cos θ, −sin θ, tx; sin θ, cos θ, ty; 0, 0, 1]
  – Transformations: rotation, translation
  – Invariants: length, angle, ratio of lengths, parallelism, incidence, cross ratio
  – OpenCV APIs: cv.estimateRigidTransform()
▪ Similarity transformation – 4 DOF
  – Matrix form: H = [s cos θ, −s sin θ, tx; s sin θ, s cos θ, ty; 0, 0, 1]
  – Transformations: rotation, translation, scaling
  – Invariants: angle, ratio of lengths, parallelism, incidence, cross ratio
▪ Affine transformation – 6 DOF
  – Matrix form: H = [a11, a12, tx; a21, a22, ty; 0, 0, 1]
  – Transformations: rotation, translation, scaling, aspect ratio, shear
  – Invariants: parallelism, incidence, cross ratio
  – OpenCV APIs: cv.getAffineTransform(), cv.warpAffine()
▪ Projective transformation (a.k.a. planar homography) – 8 DOF
  – Matrix form: H = [a11, a12, tx; a21, a22, ty; v1, v2, 1]
  – Transformations: rotation, translation, scaling, aspect ratio, shear, perspective projection
  – Invariants: incidence, cross ratio
  – OpenCV APIs: cv.getPerspectiveTransform(), cv.findHomography(), cv.warpPerspective()

Note) Similarly, 3D transformations (3D-3D geometry) are represented as 4x4 matrices.
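
Note) To make the matrix forms concrete, here is a minimal sketch that builds the 4-DOF similarity matrix from the list above and warps an image with it; 'image.jpg' and 'warped.jpg' are placeholder file names.

import numpy as np
import cv2 as cv

# Build a similarity transform (rotation + translation + scaling) and apply it.
img = cv.imread('image.jpg')
s, theta, tx, ty = 0.8, np.deg2rad(15), 50, 20
H = np.array([[s*np.cos(theta), -s*np.sin(theta), tx],
              [s*np.sin(theta),  s*np.cos(theta), ty],
              [0,                0,               1]])
warped = cv.warpPerspective(img, H, (img.shape[1], img.shape[0]))
cv.imwrite('warped.jpg', warped)
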


Note) Affine Transformation

Image: Wikipedia

Note) Affine Transformation

▪ Q) What is the difference between the two images?
  – Parallel projection vs. perspective projection
    • Affine camera (a.k.a. weak-perspective camera; f = ∞)
    • Less natural, but (sometimes) useful in technical visualization

Image: Scratchapixel, Wikipedia



Planar Homography

▪ Planar homography estimation
  – Unknown: planar homography H (8 DOF)
  – Given: point correspondences (𝐱1, 𝐱1′), …, (𝐱n, 𝐱n′)
  – Constraints: n x projective transformation 𝐱i′ = H𝐱i
  – Solution (n ≥ 4) → 4-point algorithm (see the sketch below)
    • OpenCV: cv.getPerspectiveTransform() and cv.findHomography()
    • Note) Simpler transformations need fewer minimal correspondences: affine (n ≥ 3), similarity (n ≥ 2), Euclidean (n ≥ 2)
  – Note) A planar homography can be decomposed into the relative camera pose R, 𝐭.
    • OpenCV: cv.decomposeHomographyMat()
    • The decomposition needs to know the camera matrices.
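
Note) As a rough illustration of the 4-point algorithm, a minimal direct-linear-transform (DLT) sketch of what cv.findHomography() solves internally; the real implementation adds point normalization and RANSAC, which are omitted here.

import numpy as np

def homography_dlt(pts_src, pts_dst):
    # Each correspondence (x, y) -> (u, v) gives two linear equations in the
    # nine entries of H; stack them and take the SVD null vector (n >= 4).
    A = []
    for (x, y), (u, v) in zip(pts_src, pts_dst):
        A.append([x, y, 1, 0, 0, 0, -u*x, -u*y, -u])
        A.append([0, 0, 0, x, y, 1, -v*x, -v*y, -v])
    _, _, Vt = np.linalg.svd(np.array(A, dtype=np.float64))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]   # fix the scale so that H[2, 2] = 1
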
Planar Homography

▪ Example) Perspective distortion correction [perspective_correction.py]
  – Click the four source points pts_src in the distorted image; they are mapped to the rectified corners pts_dst by 𝐱i′ = H𝐱i.

Planar Homography

▪ Example) Perspective distortion correction [perspective_correction.py]

import numpy as np
import cv2 as cv

def mouse_event_handler(event, x, y, flags, param):
    if event == cv.EVENT_LBUTTONDOWN:
        param.append((x, y))

if __name__ == '__main__':
    img_file = '../data/sunglok_card.jpg'
    card_size = (450, 250)
    offset = 10

    # Prepare the rectified points
    pts_dst = np.array([[0, 0], [card_size[0], 0], [0, card_size[1]], [card_size[0], card_size[1]]])

    # Load an image
    img = cv.imread(img_file)

    # Get the matched points from mouse clicks
    pts_src = []
    wnd_name = 'Perspective Correction: Point Selection'
    cv.namedWindow(wnd_name)
    cv.setMouseCallback(wnd_name, mouse_event_handler, pts_src)
    while len(pts_src) < 4:
        img_display = img.copy()
        cv.rectangle(img_display, (offset, offset), (offset + card_size[0], offset + card_size[1]), (0, 0, 255), 2)
        idx = min(len(pts_src), len(pts_dst))
        cv.circle(img_display, offset + pts_dst[idx], 5, (0, 255, 0), -1)
        cv.imshow(wnd_name, img_display)

Planar Homography

▪ Example) Perspective distortion correction [perspective_correction.py]

if __name__ == '__main__':
    img_file = '../data/sunglok_card.jpg'
    card_size = (450, 250)
    offset = 10

    # Prepare the rectified points
    pts_dst = np.array([[0, 0], [card_size[0], 0], [0, card_size[1]], [card_size[0], card_size[1]]])

    # Load an image
    img = cv.imread(img_file)

    # Get the matched points from mouse clicks
    pts_src = []
    ...

    if len(pts_src) == 4:
        # Calculate the planar homography and rectify the perspective distortion
        H, _ = cv.findHomography(np.array(pts_src), pts_dst)
        img_rectify = cv.warpPerspective(img, H, card_size)

        # Show the rectified image
        cv.imshow('Perspective Correction: Rectified Image', img_rectify)
        cv.waitKey(0)

    cv.destroyAllWindows()

Planar Homography

▪ Example) Planar image stitching [image_stitching.py]
  – Note) H maps pts2 to pts1, so cv.warpPerspective() brings img2 into img1's coordinate frame before merging.

import numpy as np
import cv2 as cv

# Load two images
img1 = cv.imread('../data/hill01.jpg')
img2 = cv.imread('../data/hill02.jpg')

# Retrieve matching points
brisk = cv.BRISK_create()
keypoints1, descriptors1 = brisk.detectAndCompute(img1, None)
keypoints2, descriptors2 = brisk.detectAndCompute(img2, None)

fmatcher = cv.DescriptorMatcher_create('BruteForce-Hamming')
match = fmatcher.match(descriptors1, descriptors2)

# Calculate the planar homography and merge the two images
pts1, pts2 = [], []
for i in range(len(match)):
    pts1.append(keypoints1[match[i].queryIdx].pt)
    pts2.append(keypoints2[match[i].trainIdx].pt)
pts1 = np.array(pts1, dtype=np.float32)
pts2 = np.array(pts2, dtype=np.float32)

H, inlier_mask = cv.findHomography(pts2, pts1, cv.RANSAC)

img_merged = cv.warpPerspective(img2, H, (img1.shape[1]*2, img1.shape[0]))
img_merged[:,:img1.shape[1]] = img1   # Copy img1 into the left half

# Show the merged image
img_matched = cv.drawMatches(img1, keypoints1, img2, keypoints2, match, None, None, None,
                             ...)

Planar Homography

▪ Example) 2D video stabilization [video_stabilization.py] – input: a shaking CCTV video

import numpy as np
import cv2 as cv

# Open a video and get the reference image and feature points
video = cv.VideoCapture('../data/traffic.avi')

_, gray_ref = video.read()
if gray_ref.ndim >= 3:
    gray_ref = cv.cvtColor(gray_ref, cv.COLOR_BGR2GRAY)
pts_ref = cv.goodFeaturesToTrack(gray_ref, 2000, 0.01, 10)

# Run and show video stabilization
while True:
    # Read an image from `video`
    valid, img = video.read()
    if not valid:
        break
    if img.ndim >= 3:
        gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
    else:
        gray = img.copy()

    # Extract optical flow and calculate the planar homography
    pts, status, err = cv.calcOpticalFlowPyrLK(gray_ref, gray, pts_ref, None)
    H, inlier_mask = cv.findHomography(pts, pts_ref, cv.RANSAC)

    # Synthesize a stabilized image
    warp = cv.warpPerspective(img, H, (img.shape[1], img.shape[0]))

Planar Homography

▪ Assumption) A plane is observed by the two views.
  – Perspective distortion correction: a complete plane
  – Planar image stitching: an approximate plane (valid when the scene distance ≫ its depth variation)
  – 2D video stabilization: an approximate plane (valid for small camera motion)

Epipolar Geometry

▪ Epipolar constraint
  – On the image plane:  𝐱′⊺F𝐱 = 0  (F: fundamental matrix)
  – Substituting 𝐱 = K𝐱̂ and 𝐱′ = K′𝐱̂′:  𝐱̂′⊺K′⊺FK𝐱̂ = 0
  – With E = K′⊺FK, on the normalized image plane:  𝐱̂′⊺E𝐱̂ = 0  (E: essential matrix)

Epipolar Geometry

▪ Epipolar constraint: Derivation
    𝐗′ = R𝐗 + 𝐭
      ↓ (𝐗 = λ𝐱̂ and 𝐗′ = λ′𝐱̂′)
    λ′𝐱̂′ = λR𝐱̂ + 𝐭
      ↓ (cross product with 𝐭; note 𝐭 × 𝐭 = 𝟎)
    λ′𝐭 × 𝐱̂′ = λ𝐭 × R𝐱̂
      ↓ (dot product with 𝐱̂′)
    λ′𝐱̂′ ∙ (𝐭 × 𝐱̂′) = λ𝐱̂′ ∙ (𝐭 × R𝐱̂)
      ↓ (𝐱̂′ ∙ (𝐭 × 𝐱̂′) = 0)
    λ𝐱̂′ ∙ (𝐭 × R𝐱̂) = 0
      ↓ (E = [𝐭]×R, writing the cross product with the skew-symmetric matrix [𝐭]× = [0, −tz, ty; tz, 0, −tx; −ty, tx, 0])
    𝐱̂′⊺E𝐱̂ = 0
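
Note) A quick numeric sanity check of the derivation; the pose and the point below are arbitrary synthetic assumptions.

import numpy as np

# Verify the epipolar constraint x̂'ᵀ E x̂ = 0 on synthetic data.
theta = 0.1                                    # rotation about the z-axis
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0,              0,             1]])
t = np.array([1.0, 0.1, -0.05])                # baseline (any scale)
tx = np.array([[0, -t[2], t[1]],
               [t[2], 0, -t[0]],
               [-t[1], t[0], 0]])              # skew-symmetric [t]x
E = tx @ R                                     # essential matrix

X = np.array([0.3, -0.2, 4.0])                 # a 3D point in camera #1
X2 = R @ X + t                                 # the same point in camera #2
x1, x2 = X / X[2], X2 / X2[2]                  # normalized image points
print(x2 @ E @ x1)                             # ~0 up to round-off
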
Epipolar Geometry

▪ Epipolar geometry
  – Baseline
    • Distance between the two camera centers 𝐂 and 𝐂′
  – Epipolar plane
    • Plane generated by the three points 𝐗, 𝐂, and 𝐂′
  – Epipole (a.k.a. epipolar point)
    • Projection of the other camera's center
    • 𝐞 = P𝐂′ and 𝐞′ = P′𝐂
      – e.g. with P = K[I|𝟎] and P′ = K′[R|𝐭]: 𝐞 = −KR⊺𝐭 and 𝐞′ = K′𝐭

Epipolar Geometry

▪ Epipolar geometry
  – Epipolar line
    • 𝐱′⊺F𝐱 = 0 → 𝐱′⊺𝐥′ = 0 where 𝐥′ = F𝐱
      – 𝐱′ will lie on the line 𝐥′.
    • 𝐱′⊺F𝐱 = 0 → 𝐥⊺𝐱 = 0 where 𝐥 = F⊺𝐱′
      – 𝐱 will lie on the line 𝐥.
    • The search space of feature matching is reduced from the whole image plane to the epipolar line.
  – Note) Every epipolar line intersects at the epipole.
    • F𝐞 = 0 (and, similarly, F⊺𝐞′ = 0)
    • 𝐞 is the null space of F.
  – Special case) Stereo cameras: R = I3×3, 𝐭 = (−b, 0, 0)⊺ → the epipolar lines are horizontal and the epipoles are at infinity.
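
Note) Since F𝐞 = 0, the epipole can be read off from the SVD of an estimated F; a minimal sketch, assuming F is already available and the epipole is finite.

import numpy as np

# The right null vector of F is the epipole e in image #1 (F e = 0).
_, _, Vt = np.linalg.svd(F)
e = Vt[-1]
e /= e[2]            # to pixel coordinates (assumes a finite epipole)
print(F @ e)         # ~0: every epipolar line l = F.T @ x' passes through e
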
Epipolar Geometry

▪ Example) Epipolar line visualization [epipolar_line_visualization.py]
  – Clicking a point in one image draws its corresponding epipolar line in the other image.

Epipolar Geometry

▪ Example) Epipolar line visualization [epipolar_line_visualization.py]

import random
import numpy as np
import cv2 as cv

def mouse_event_handler(event, x, y, flags, param):
    if event == cv.EVENT_LBUTTONDOWN:
        param.append((x, y))

def draw_straight_line(img, line, color, thickness=1):
    h, w, *_ = img.shape
    a, b, c = line   # Line: ax + by + c = 0
    if abs(a) > abs(b):   # More vertical: intersect the top (pt1) and bottom (pt2) borders
        pt1 = (int(c / -a), 0)
        pt2 = (int((b*h + c) / -a), h)
    else:
        ...
    cv.line(img, pt1, pt2, color, thickness)

if __name__ == '__main__':
    # Load two images
    img1 = cv.imread('../data/KITTI07/image_0/000000.png', cv.IMREAD_COLOR)
    img2 = cv.imread('../data/KITTI07/image_0/000023.png', cv.IMREAD_COLOR)
    # Note) `F` is derived from `fundamental_mat_estimation.py`.
    F = np.array([[ 3.34638533e-07, 7.58547151e-06, -2.04147752e-03], ...])

    # Register event handlers and show images
    wnd1_name, wnd2_name = 'Epipolar Line: Image #1', 'Epipolar Line: Image #2'
    img1_pts, img2_pts = [], []
    cv.namedWindow(wnd1_name)
    cv.setMouseCallback(wnd1_name, mouse_event_handler, img1_pts)

Epipolar Geometry

▪ Example) Epipolar line visualization [epipolar_line_visualization.py]

if __name__ == '__main__':
    # Load two images
    img1 = cv.imread('../data/KITTI07/image_0/000000.png', cv.IMREAD_COLOR)
    img2 = cv.imread('../data/KITTI07/image_0/000023.png', cv.IMREAD_COLOR)
    F = np.array([[ 3.34638533e-07, 7.58547151e-06, -2.04147752e-03], ...])

    # Register event handlers and show images
    ...

    # Get a point from an image and draw its corresponding epipolar line on the other image
    while True:
        if len(img1_pts) > 0:
            for x, y in img1_pts:
                color = (random.randrange(256), random.randrange(256), random.randrange(256))
                cv.circle(img1, (x, y), 4, color, -1)
                epipolar_line = F @ [[x], [y], [1]]        # 𝐥′ = F𝐱
                draw_straight_line(img2, epipolar_line, color, 2)
            img1_pts.clear()
        if len(img2_pts) > 0:
            for x, y in img2_pts:
                color = (random.randrange(256), random.randrange(256), random.randrange(256))
                cv.circle(img2, (x, y), 4, color, -1)
                epipolar_line = F.T @ [[x], [y], [1]]      # 𝐥 = F⊺𝐱′
                draw_straight_line(img1, epipolar_line, color, 2)
            img2_pts.clear()
        cv.imshow(wnd2_name, img2)

Relative Camera Pose Estimation

▪ Relative camera pose estimation (~ fundamental/essential matrix estimation)
  – Unknown: rotation and translation R, 𝐭 (5 DOF; up to scale → “scale ambiguity”)
  – Given: point correspondences (𝐱1, 𝐱1′), …, (𝐱n, 𝐱n′) and camera matrices K, K′
  – Constraints: n x epipolar constraint (𝐱′⊺F𝐱 = 0 or 𝐱̂′⊺E𝐱̂ = 0)
  – Note) R, 𝐭 ↔ E ↔ F are related by E = [𝐭]×R and E = K′⊺FK.
  – Solutions) Fundamental matrix: 7/8-point algorithm (7 DOF; intrinsic + extrinsic)
    • Properties: det(F) = 0 (i.e. rank(F) = 2)
    • Estimation: cv.findFundamentalMat() → 1 solution
    • Conversion to E: E = K′⊺FK
  – Solutions) Essential matrix: 5-point algorithm (5 DOF; extrinsic)
    • Properties: det(E) = 0 and 2EE⊺E − tr(EE⊺)E = 0
    • Estimation: cv.findEssentialMat() → k solutions
    • Decomposition: cv.decomposeEssentialMat() → 4 solutions (“relative pose ambiguity”)
    • Decomposition with positive-depth check: cv.recoverPose() → 1 solution
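
Note) The two properties listed for E can be checked numerically; a small sketch with an arbitrary assumed pose (same [𝐭]× construction as in the derivation).

import numpy as np

# Check det(E) = 0 and 2EEᵀE − tr(EEᵀ)E = 0 for an E built as [t]x R.
def skew(t):
    return np.array([[0, -t[2], t[1]], [t[2], 0, -t[0]], [-t[1], t[0], 0]])

theta = 0.1
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0, 0, 1]])
E = skew(np.array([1.0, 0.2, -0.1])) @ R
print(np.linalg.det(E))                          # ~0
print(2 * E @ E.T @ E - np.trace(E @ E.T) * E)   # ~0 (3x3 zero matrix)
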
Relative Camera Pose Estimation: Scale Ambiguity

The Sandcrawler @ Star Wars IV: A New Hope (1977)
(a) a 2-meter-wide tunnel vs. (b) a 1-meter-wide tunnel: both produce the same images, so monocular geometry cannot tell the scale.

Reference: Choi et al., Resolving Scale Ambiguity in Monocular Visual Odometry, URAI, 2013
Relative Camera Pose Estimation: Scale Ambiguity

▪ How to resolve scale ambiguity
  – Additional sensors: speedometers (odometers), IMUs, GPSs, depth/distance sensors (stereo, RGB-D, LiDAR, …)
  – Motion constraints: known initial translation, Ackermann steering kinematics
  – Observation constraints: known size of objects, known and constant height of the camera

  ∴ ρ = L / l, where l is a length in (a) the reconstructed world and L is the corresponding length in (b) the real world

Reference: Choi et al., Resolving Scale Ambiguity in Monocular Visual Odometry, URAI, 2013
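
Note) In code the fix is a single rescaling; a sketch where L_real and l_rec are hypothetical names for a known physical length and its reconstructed counterpart.

# Rescale the up-to-scale reconstruction with one known length.
rho = L_real / l_rec     # scale factor between the real and reconstructed worlds
t_metric = rho * t       # the unit-norm translation from cv.recoverPose(), now in meters
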
Relative Camera Pose Estimation: Relative Pose Ambiguity

▪ How to resolve pose ambiguity
  – Positive depth constraint: among the four (R, 𝐭) candidates from the decomposition, only one reconstructs the points in front of both cameras (see the sketch below).

Image: Hartley and Zisserman, Multiple View Geometry in Computer Vision (2nd Edition), Cambridge University Press, 2004
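
Note) A minimal sketch of this check, mirroring what cv.recoverPose() does internally; the function and variable names are mine, and pts1_norm, pts2_norm are assumed to be Nx2 arrays of normalized image points.

import numpy as np
import cv2 as cv

def select_pose(E, pts1_norm, pts2_norm):
    # Try all four (R, t) candidates and keep the one that places the most
    # triangulated points in front of both cameras (positive depth).
    R1, R2, t = cv.decomposeEssentialMat(E)
    best, best_num = None, -1
    for R, tt in [(R1, t), (R1, -t), (R2, t), (R2, -t)]:
        P0 = np.eye(3, 4)                      # camera #1: [I | 0]
        P1 = np.hstack((R, tt))                # camera #2: [R | t]
        X = cv.triangulatePoints(P0, P1, pts1_norm.T, pts2_norm.T)
        X /= X[3]                              # dehomogenize
        z1 = X[2]                              # depth in camera #1
        z2 = (R @ X[:3] + tt)[2]               # depth in camera #2
        num = np.sum((z1 > 0) & (z2 > 0))      # points in front of both
        if num > best_num:
            best, best_num = (R, tt), num
    return best
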
Relative Camera Pose Estimation

▪ Example) Fundamental matrix estimation [fundamental_mat_estimation.py]

import numpy as np
import cv2 as cv

# Load two images
img1 = cv.imread('../data/KITTI07/image_0/000000.png')
img2 = cv.imread('../data/KITTI07/image_0/000023.png')
f, cx, cy = 707.0912, 601.8873, 183.1104   # From the KITTI dataset
K = np.array([[f, 0, cx], [0, f, cy], [0, 0, 1]])

# Retrieve matching points
brisk = cv.BRISK_create()
keypoints1, descriptors1 = brisk.detectAndCompute(img1, None)
keypoints2, descriptors2 = brisk.detectAndCompute(img2, None)

fmatcher = cv.DescriptorMatcher_create('BruteForce-Hamming')
match = fmatcher.match(descriptors1, descriptors2)

Relative Camera Pose Estimation

▪ Example) Fundamental matrix estimation [fundamental_mat_estimation.py]

# Load two images
...

# Retrieve matching points
...

# Calculate the fundamental matrix
pts1, pts2 = [], []
for i in range(len(match)):
    pts1.append(keypoints1[match[i].queryIdx].pt)
    pts2.append(keypoints2[match[i].trainIdx].pt)
pts1 = np.array(pts1, dtype=np.float32)
pts2 = np.array(pts2, dtype=np.float32)
F, inlier_mask = cv.findFundamentalMat(pts1, pts2, cv.FM_RANSAC, 0.5, 0.999)

Relative Camera Pose Estimation

▪ Example) Fundamental matrix estimation [fundamental_mat_estimation.py]

# Load two images
...
f, cx, cy = 707.0912, 601.8873, 183.1104   # From the KITTI dataset
K = np.array([[f, 0, cx], [0, f, cy], [0, 0, 1]])

# Retrieve matching points
...

# Calculate the fundamental matrix
...
F, inlier_mask = cv.findFundamentalMat(pts1, pts2, cv.FM_RANSAC, 0.5, 0.999)

# Extract the relative camera pose between the two images
E = K.T @ F @ K
positive_num, R, t, positive_mask = cv.recoverPose(E, pts1, pts2, K, mask=inlier_mask)
...
print(f'* The position of Image #2 = {-R.T @ t}')   # e.g. [-0.57, 0.09, 0.82] with Image #1 at [0, 0, 0]
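
Note) As a quick follow-up check (a sketch, not part of the original script), the RANSAC inliers should nearly satisfy the epipolar constraint.

# Mean algebraic epipolar error |x'ᵀ F x| over the RANSAC inliers.
ones = np.ones((len(pts1), 1))
x1 = np.hstack((pts1, ones))                         # homogeneous pixel points
x2 = np.hstack((pts2, ones))
residual = np.abs(np.sum(x2 * (x1 @ F.T), axis=1))
print(residual[inlier_mask.ravel() == 1].mean())     # small for inliers
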
Relative Camera Pose Estimation

▪ Example) Monocular visual odometry (epipolar version) [vo_epipolar.py]
  – Given K (use_5pt = True), for each pair of adjacent images n−1 and n:
    • [Tentative Matching] Extract optical flow between the adjacent images and pass the tentative matches
    • [Outlier Rejection] Remove wrong matches using cv.findEssentialMat() and pass the essential matrix E
    • [Relative Pose Estimation] Decompose the camera pose using cv.recoverPose() and pass the relative pose R, 𝐭 (note: ‖𝐭‖ = 1)
    • [Relative Pose Accumulation] Tₙ⁰ = Tₙ₋₁⁰ · Δ, with Δ = [R, 𝐭; 𝟎⊺, 1]⁻¹

Relative Camera Pose Estimation

▪ Example) Monocular visual odometry (epipolar version) [vo_epipolar.py] – result video (4X)

Relative Camera Pose Estimation

▪ Example) Monocular visual odometry (epipolar version) [vo_epipolar.py]

import numpy as np
import cv2 as cv

video_file = '../data/KITTI07/image_0/%06d.png'
f, cx, cy = 707.0912, 601.8873, 183.1104
use_5pt = True
min_inlier_num = 100
min_inlier_ratio = 0.2
traj_file = '../data/vo_epipolar.xyz'

# Open a video and get an initial image
video = cv.VideoCapture(video_file)

_, gray_prev = video.read()
if gray_prev.ndim >= 3 and gray_prev.shape[2] > 1:
    gray_prev = cv.cvtColor(gray_prev, cv.COLOR_BGR2GRAY)

# Prepare a plot to visualize the camera trajectory
...

# Run the monocular visual odometry
K = np.array([[f, 0, cx], [0, f, cy], [0, 0, 1]])
camera_traj = np.zeros((1, 3))
camera_pose = np.eye(4)
while True:
    # Grab an image from the video
    valid, img = video.read()
    if not valid:
        break

Relative Camera Pose Estimation

▪ Example) Monocular visual odometry (epipolar version) [vo_epipolar.py]

# Run the monocular visual odometry
K = np.array([[f, 0, cx], [0, f, cy], [0, 0, 1]])
camera_traj = np.zeros((1, 3))
camera_pose = np.eye(4)
while True:
    # Grab an image from the video
    valid, img = video.read()
    if not valid:
        break
    if img.ndim >= 3 and img.shape[2] > 1:
        gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
    else:
        gray = img.copy()

    # Extract optical flow
    pts_prev = cv.goodFeaturesToTrack(gray_prev, 2000, 0.01, 10)
    pts, status, err = cv.calcOpticalFlowPyrLK(gray_prev, gray, pts_prev, None)
    gray_prev = gray

    # Calculate the relative pose
    if use_5pt:
        E, inlier_mask = cv.findEssentialMat(pts_prev, pts, f, (cx, cy), cv.FM_RANSAC, 0.99, 1)
    else:
        F, inlier_mask = cv.findFundamentalMat(pts_prev, pts, cv.FM_RANSAC, 1, 0.99)
        E = K.T @ F @ K
    inlier_num, R, t, inlier_mask = cv.recoverPose(E, pts_prev, pts, focal=f, pp=(cx, cy), mask=inlier_mask)

Relative Camera Pose Estimation

▪ Example) Monocular visual odometry (epipolar version) [vo_epipolar.py]

# Run the monocular visual odometry
while True:
    # Grab an image from the video
    ...

    # Extract optical flow
    ...

    # Calculate the relative pose
    if use_5pt:
        E, inlier_mask = cv.findEssentialMat(pts_prev, pts, f, (cx, cy), cv.FM_RANSAC, 0.99, 1)
    else:
        F, inlier_mask = cv.findFundamentalMat(pts_prev, pts, cv.FM_RANSAC, 1, 0.99)
        E = K.T @ F @ K
    inlier_num, R, t, inlier_mask = cv.recoverPose(E, pts_prev, pts, focal=f, pp=(cx, cy), mask=inlier_mask)
    inlier_ratio = inlier_num / len(pts)

    # Accumulate the relative pose if the result is reliable
    info_color = (0, 255, 0)
    if inlier_num > min_inlier_num and inlier_ratio > min_inlier_ratio:
        T = np.eye(4)
        T[:3, :3] = R
        T[:3, 3] = t.flatten()
        camera_pose = camera_pose @ np.linalg.inv(T)   # Tₙ⁰ = Tₙ₋₁⁰ [R, 𝐭; 𝟎⊺, 1]⁻¹
        info_color = (0, 0, 255)
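
Note) The plotting and saving code is elided on the slides; a sketch of how the accumulated pose could be turned into the trajectory file (variable names follow the script above, the rest is an assumption).

    # Inside the loop: append the camera position (the translation part of the
    # accumulated 4x4 pose) to the trajectory.
    camera_traj = np.vstack((camera_traj, camera_pose[:3, 3]))

# After the loop: save the trajectory as an .xyz point list.
np.savetxt(traj_file, camera_traj)
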
Relative Camera Pose Estimation

▪ Relative camera pose estimation (~ fundamental/essential matrix estimation)
  – Unknown: rotation and translation R, 𝐭 (5 DOF; up to scale → “scale ambiguity”)
  – Given: point correspondences (𝐱1, 𝐱1′), …, (𝐱n, 𝐱n′) and camera matrices K, K′
  – Constraints: n x epipolar constraint (𝐱′⊺F𝐱 = 0 or 𝐱̂′⊺E𝐱̂ = 0)
  – Solutions) Fundamental matrix: 7/8-point algorithm (7 DOF; intrinsic + extrinsic)
    • Properties: det(F) = 0 (i.e. rank(F) = 2)
    • Estimation: cv.findFundamentalMat() → 1 solution
    • Conversion to E: E = K′⊺FK
    • Degenerate cases: no translation, correspondences from a single plane
  – Solutions) Essential matrix: 5-point algorithm (5 DOF; extrinsic)
    • Properties: det(E) = 0 and 2EE⊺E − tr(EE⊺)E = 0
    • Estimation: cv.findEssentialMat() → k solutions
    • Decomposition: cv.decomposeEssentialMat() → 4 solutions (“relative pose ambiguity”)
    • Decomposition with positive-depth check: cv.recoverPose() → 1 solution
    • Degenerate cases: no translation (∵ E = [𝐭]×R)

Relative Camera Pose Estimation

▪ Relative camera pose estimation
  – Solutions) Fundamental matrix: 7/8-point algorithm (7 DOF; intrinsic + extrinsic)
    • Estimation: cv.findFundamentalMat() → 1 solution
    • Conversion to E: E = K′⊺FK
    • Degenerate cases: no translation, correspondences from a single plane
  – Solutions) Essential matrix: 5-point algorithm (5 DOF; extrinsic)
    • Estimation: cv.findEssentialMat() → k solutions
    • Decomposition: cv.decomposeEssentialMat() → 4 solutions (“relative pose ambiguity”)
    • Decomposition with positive-depth check: cv.recoverPose() → 1 solution
    • Degenerate cases: no translation (∵ E = [𝐭]×R)
  – Solutions) Planar homography: 4-point algorithm (8 DOF; up to scale → “scale ambiguity”)
    • Estimation: cv.findHomography() → 1 solution
    • Conversion to calibrated Ĥ: Ĥ = K′⁻¹HK
    • Decomposition: cv.decomposeHomographyMat() → 4 solutions (“relative pose ambiguity”)
    • Degenerate cases: correspondences not from a single plane
  – Note) Their degenerate cases are complementary: the homography covers the planar and no-translation cases where F/E degenerate, while F/E cover general scenes where the single-plane assumption fails.

Relative Camera Pose Estimation

▪ Relative camera pose estimation
  – Solutions) Planar homography: 4-point algorithm (8 DOF; up to scale → “scale ambiguity”)
    • Estimation: cv.findHomography() → 1 solution
    • Derivation (for points on a plane with normal 𝐧 at distance d, i.e. nx·x̂ + ny·ŷ + nz·ẑ − d = 0):
        λ′𝐱̂′ = λR𝐱̂ + 𝐭
          ↓ ((λ/d)𝐧⊺𝐱̂ = 1, from the plane equation)
        λ′𝐱̂′ = λ(R + (1/d)𝐭𝐧⊺)𝐱̂
          ↓ (Ĥ = R + (1/d)𝐭𝐧⊺)
        𝐱̂′ ∝ Ĥ𝐱̂
          ↓ (𝐱̂ = K⁻¹𝐱 and 𝐱̂′ = K′⁻¹𝐱′)
        𝐱′ ∝ H𝐱 with H = K′ĤK⁻¹
    • Conversion to calibrated Ĥ: Ĥ = K′⁻¹HK
    • Decomposition: cv.decomposeHomographyMat() → 4 solutions (“relative pose ambiguity”)
    • Degenerate cases: correspondences not from a single plane
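
Note) A small usage sketch of the decomposition, assuming H comes from cv.findHomography() and K (= K′) from calibration.

import numpy as np
import cv2 as cv

# Decompose a homography into up to four {R, t/d, n} candidates.
num, Rs, ts, normals = cv.decomposeHomographyMat(H, K)
for R, t, n in zip(Rs, ts, normals):
    rvec, _ = cv.Rodrigues(R)          # rotation as a rotation vector
    print(rvec.ravel(), t.ravel(), n.ravel())
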
Relative Camera Pose Estimation: Overview

▪ On image planes
  – General 2D-2D geometry: fundamental matrix F (7 DOF)
    • Formulation: F = K′⁻⊺EK⁻¹ (E = K′⊺FK)
    • Estimation: 7-point algorithm (n ≥ 7) → k solutions; (normalized) 8-point algorithm → 1 solution; cv.findFundamentalMat()
    • Input: (𝐱i, 𝐱i′) [px] on the image plane
    • Degenerate cases: no translational motion; correspondences from a single plane
    • Decomposition to R and 𝐭: convert to an essential matrix and decompose it
  – Planar 2D-2D geometry: planar homography H (8 DOF)
    • Formulation: H = K′ĤK⁻¹ (Ĥ = K′⁻¹HK)
    • Estimation: 4-point algorithm (n ≥ 4) → 1 solution; cv.findHomography()
    • Input: (𝐱i, 𝐱i′) [px] on a plane in the image plane
    • Degenerate cases: correspondences not from a single plane
    • Decomposition to R and 𝐭: cv.decomposeHomographyMat()
▪ On normalized image planes
  – General 2D-2D geometry: essential matrix E (5 DOF)
    • Formulation: E = [𝐭]×R
    • Estimation: 5-point algorithm (n ≥ 5) → k solutions; cv.findEssentialMat()
    • Input: (𝐱̂i, 𝐱̂i′) [m] on the normalized image plane
    • Degenerate cases: no translational motion
    • Decomposition to R and 𝐭: cv.decomposeEssentialMat(); cv.recoverPose()
  – Planar 2D-2D geometry: (calibrated) planar homography Ĥ (8 DOF)
    • Formulation: Ĥ = R + (1/d)𝐭𝐧⊺
    • Estimation: 4-point algorithm (n ≥ 4) → 1 solution; cv.findHomography()
    • Input: (𝐱̂i, 𝐱̂i′) [m] on a plane in the normalized image plane
    • Degenerate cases: correspondences not from a single plane
    • Decomposition to R and 𝐭: cv.decomposeHomographyMat() with K = I3×3

Triangulation

▪ Triangulation (point localization)
  – Unknown: position of a 3D point 𝐗 (3 DOF)
  – Given: point correspondence (𝐱, 𝐱′), camera matrices (K, K′), and relative pose (R, 𝐭)
  – Constraints: 𝐱 = K[I | 𝟎]𝐗 and 𝐱′ = K′[R | 𝐭]𝐗
  – Solution
    • OpenCV: cv.triangulatePoints()
    • Special case) Stereo cameras: R = I3×3, 𝐭 = (−b, 0, 0)⊺, and K = K′ = [f, 0, cx; 0, f, cy; 0, 0, 1]
      ∴ Z = f·b / (x − x′)
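
Note) A one-line worked instance of the stereo special case; all numbers are arbitrary assumptions.

# Depth from disparity: Z = f*b / (x - x'), e.g. f = 1000 px, b = 0.25 m,
# and a 40 px disparity give Z = 1000 * 0.25 / 40 = 6.25 m.
f, b = 1000.0, 0.25
x, x_prime = 640.0, 600.0
Z = f * b / (x - x_prime)   # 6.25 [m]
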
Triangulation

▪ Example) Triangulation
  – Input: two projections (data/image_formation0.xyz, data/image_formation1.xyz) of a point cloud (data/box.xyz); the output (data/triangulation.xyz) can be viewed in CloudCompare.

import numpy as np
import cv2 as cv

f, cx, cy = 1000., 320., 240.
pts0 = np.loadtxt('../data/image_formation0.xyz')[:,:2]
pts1 = np.loadtxt('../data/image_formation1.xyz')[:,:2]
output_file = '../data/triangulation.xyz'

# Estimate the relative pose of the two views
F, _ = cv.findFundamentalMat(pts0, pts1, cv.FM_8POINT)
K = np.array([[f, 0, cx], [0, f, cy], [0, 0, 1]])
E = K.T @ F @ K
_, R, t, _ = cv.recoverPose(E, pts0, pts1)

# Reconstruct the 3D points (triangulation)
P0 = K @ np.eye(3, 4, dtype=np.float32)   # 𝐱 = K[I|𝟎]𝐗
Rt = np.hstack((R, t))
P1 = K @ Rt                               # 𝐱′ = K′[R|𝐭]𝐗 (here K′ = K)
X = cv.triangulatePoints(P0, P1, pts0.T, pts1.T)
X /= X[3]
X = X.T

# Write the reconstructed 3D points
np.savetxt(output_file, X)

Summary

▪ Planar Homography: 𝐱i′ = H𝐱i
  – Example) Perspective distortion correction
  – Example) Planar image stitching
  – Example) 2D video stabilization
▪ Epipolar Geometry: 𝐱′⊺F𝐱 = 0 (on the image plane), 𝐱̂′⊺E𝐱̂ = 0 (on the normalized image plane)
  – Example) Epipolar line visualization
▪ Relative Camera Pose Estimation: finding R and 𝐭 (5 DOF), related to E by E = [𝐭]×R and to F by E = K′⊺FK
  – Solutions) Fundamental matrix: 7/8-point algorithm (7 DOF)
  – Solutions) Essential matrix: 5-point algorithm (5 DOF)
  – Solutions) Planar homography: 4-point algorithm (8 DOF)
  – Example) Fundamental matrix estimation
  – Example) Monocular visual odometry (epipolar version)
▪ Triangulation: finding 𝐗 (3 DOF)
  – Example) Triangulation
