

YOLO Detection and MATLAB Integration Documentation

Anıl Akpınar

220201013

Introduction:
This documentation provides an in-depth explanation of the object detection workflow using the YOLO
(You Only Look Once) algorithm in Python, followed by the integration of detection logs into
MATLAB for visualization. The goal is to detect objects in video frames, save detection data, and
overlay the results onto the video for analysis.

Why YOLO?
YOLO is chosen for object detection due to its real-time performance and accuracy. It divides the
image into a grid and predicts bounding boxes and class probabilities for each grid cell. This single-
stage approach makes YOLO significantly faster than two-stage methods like Faster R-CNN.

Key benefits of YOLO:

Speed: Processes frames in real time, which is crucial for live video feeds.
Accuracy: Provides good detection accuracy with fewer false positives.
Simplicity: Combines detection and classification in one network, reducing complexity.

Python Workflow Explanation:
Loading the YOLO Model:
The custom-trained YOLOv4-tiny network is loaded through OpenCV's DNN module (cv2.dnn.readNetFromDarknet) from a configuration file and a weights file, and the names of the output layers are collected for inference.

Video Frame Processing:
Each video frame is read and converted into a format suitable for YOLO (a blob). The model then predicts objects in the frame.

import cv2
import numpy as np
import datetime

video_path = '/home/pervane/Videos/Screencasts/Screencast.mp4'

# Load the custom-trained YOLOv4-tiny network via OpenCV's DNN module
yolo_net = cv2.dnn.readNetFromDarknet(
    '/home/pervane/Projects/YOLO/ipkamera/yolov4-tiny-custom.cfg',
    '/home/pervane/Projects/YOLO/ipkamera/yolov4-tiny-custom_final.weights')

layer_names = yolo_net.getLayerNames()
output_layers = [layer_names[i - 1] for i in yolo_net.getUnconnectedOutLayers()]

cap = cv2.VideoCapture(video_path)

log_file = open("/home/pervane/Projects/YOLO/tespitelog.txt", "w")
log_file.write("Detected Objects:\n")

while True:
    ret, frame = cap.read()
    if not ret:
        print("Video has ended or could not be opened.")
        break

    current_time = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")

    # Convert the frame to a 416x416 blob, scaled to [0, 1] (0.00392 = 1/255)
    blob = cv2.dnn.blobFromImage(frame, 0.00392, (416, 416), (0, 0, 0), True, crop=False)
    yolo_net.setInput(blob)
    outs = yolo_net.forward(output_layers)

    for out in outs:
        for detection in out:
            scores = detection[5:]
            class_id = np.argmax(scores)
            confidence = scores[class_id]

            if confidence > 0.5:
                # YOLO outputs normalized center coordinates and box size
                center_x = int(detection[0] * frame.shape[1])
                center_y = int(detection[1] * frame.shape[0])
                w = int(detection[2] * frame.shape[1])
                h = int(detection[3] * frame.shape[0])
                cv2.rectangle(frame, (center_x - w // 2, center_y - h // 2),
                              (center_x + w // 2, center_y + h // 2), (0, 255, 0), 2)

                log_file.write(f"Time: {current_time}, Class ID: {class_id}, Confidence: {confidence:.2f}, "
                               f"Position: (x: {center_x - w // 2}, y: {center_y - h // 2}), "
                               f"Width: {w}, Height: {h}\n")

    cv2.imshow('YOLO - Object Detection', frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

log_file.close()
cap.release()
cv2.destroyAllWindows()

Object Detection and Bounding Box Extraction:

YOLO outputs bounding boxes, confidence scores, and class IDs. Only detections with confidence
above a threshold (e.g., 0.5) are considered valid.
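The thresholding logic from the detection loop can be isolated into a small helper, which makes it easy to test with synthetic network outputs. The function below is a sketch of the same filtering the script performs (the name filter_detections is an illustration, not part of the original code):

```python
import numpy as np

def filter_detections(outs, frame_w, frame_h, conf_threshold=0.5):
    """Keep detections whose best class score exceeds conf_threshold,
    converting YOLO's normalized (center x, center y, width, height)
    outputs into pixel coordinates, as the detection loop does."""
    boxes = []
    for out in outs:
        for detection in out:
            scores = detection[5:]
            class_id = int(np.argmax(scores))
            confidence = float(scores[class_id])
            if confidence > conf_threshold:
                cx = int(detection[0] * frame_w)
                cy = int(detection[1] * frame_h)
                w = int(detection[2] * frame_w)
                h = int(detection[3] * frame_h)
                # store the top-left corner, matching cv2.rectangle usage
                boxes.append((cx - w // 2, cy - h // 2, w, h,
                              class_id, confidence))
    return boxes
```

For example, a synthetic detection at the frame center with class scores (0.1, 0.8) and a 416x416 frame yields one box of 104x104 pixels at (156, 156) with class ID 1.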
Log File Generation:
Each detected object's information, including a timestamp, class ID, confidence, and bounding box coordinates, is written to a plain-text log file (tespitelog.txt). This log serves as a record for later visualization; for the MATLAB step the same fields are read from a CSV file (tespitelog.csv), so the log must be converted to CSV form (the conversion step is not shown in the scripts).
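The Python script writes a plain-text tespitelog.txt, while the MATLAB script reads tespitelog.csv. A small converter along the following lines (an assumption, since the original conversion step is not included) could produce a CSV with the column names the MATLAB code expects:

```python
import csv
import re

# Matches one log line produced by the detection script, e.g.:
# Time: 2025-01-01 12:00:00, Class ID: 3, Confidence: 0.87,
#   Position: (x: 10, y: 20), Width: 100, Height: 50
LINE_RE = re.compile(
    r"Time: (?P<time>[\d\- :]+), Class ID: (?P<cls>\d+), "
    r"Confidence: (?P<conf>[\d.]+), "
    r"Position: \(x: (?P<x>-?\d+), y: (?P<y>-?\d+)\), "
    r"Width: (?P<w>\d+), Height: (?P<h>\d+)"
)

def log_to_rows(lines):
    """Parse log lines into dicts keyed by the column names the
    MATLAB script reads ('Position X', 'Position Y', ...)."""
    rows = []
    for line in lines:
        m = LINE_RE.search(line)
        if m:  # skips the "Detected Objects:" header and malformed lines
            rows.append({
                "Time": m.group("time"),
                "Class ID": int(m.group("cls")),
                "Confidence": float(m.group("conf")),
                "Position X": int(m.group("x")),
                "Position Y": int(m.group("y")),
                "Width": int(m.group("w")),
                "Height": int(m.group("h")),
            })
    return rows

def write_csv(rows, path):
    """Write parsed rows as the CSV file MATLAB's readtable imports."""
    fields = ["Time", "Class ID", "Confidence",
              "Position X", "Position Y", "Width", "Height"]
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)
```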

MATLAB Workflow Explanation:


Reading Detection Logs:

The CSV version of the detection log is imported into MATLAB. It contains all the information needed to overlay detections onto the video.

data = readtable('/home/pervane/Projects/YOLO/tespitelog.csv', 'VariableNamingRule', 'preserve');

positions_x = str2double(data.("Position X"));
positions_y = str2double(data.("Position Y"));
widths = str2double(data.Width);
heights = str2double(data.Height);

% Drop rows where any coordinate failed to parse
valid_rows = ~isnan(positions_x) & ~isnan(positions_y) & ~isnan(widths) & ~isnan(heights);
positions_x = positions_x(valid_rows);
positions_y = positions_y(valid_rows);
widths = widths(valid_rows);
heights = heights(valid_rows);
class_ids = data.("Class ID")(valid_rows);
confidences = data.Confidence(valid_rows);

figure;
hold on;
axis equal;
title('YOLO Detections');
xlabel('X');
ylabel('Y');
for i = 1:length(positions_x)
    rectangle('Position', [positions_x(i), positions_y(i), widths(i), heights(i)], ...
              'EdgeColor', 'r', 'LineWidth', 2);
    text(positions_x(i), positions_y(i) - 10, ...
         sprintf('Class: %d, Conf: %.2f', class_ids(i), confidences(i)), ...
         'Color', 'blue', 'FontSize', 8);
end

hold off;

Visualization Issues:

An issue was encountered where bounding boxes did not align perfectly with video frames. This could
be due to timing mismatches or incorrect frame-rate assumptions.
Solution Considerations:

Ensure that the log timestamps are synchronized with the video frames.
Verify that the bounding box coordinates match the video resolution.
Account for the axis direction: the log stores image coordinates with the origin at the top-left, while MATLAB plot axes default to a bottom-left origin (axis ij corrects this).
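For the timestamp synchronization, one option is to map each log timestamp to the nearest frame index, assuming the recording's start time and a constant frame rate are known (both are assumptions here, since neither is stored in the log):

```python
from datetime import datetime

TIME_FMT = "%Y-%m-%d %H:%M:%S"  # the format the detection script logs

def frame_index(timestamp, start_time, fps):
    """Map a logged timestamp to the nearest video frame index,
    given the recording's start time and constant frame rate."""
    t = datetime.strptime(timestamp, TIME_FMT)
    t0 = datetime.strptime(start_time, TIME_FMT)
    return round((t - t0).total_seconds() * fps)
```

For a 30 fps recording, a detection logged 10 seconds after the start maps to frame 300. Note that the one-second resolution of the logged timestamps limits how precisely boxes can be aligned; logging the frame counter directly would avoid this.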

Conclusion:
This workflow demonstrates a complete pipeline from object detection using YOLO in Python to
visualizing the results in MATLAB. Although the detection process was successful, further refinement
is needed in MATLAB to achieve perfect frame synchronization and accurate bounding box overlays.
Future improvements may include using Python for visualization to maintain consistency across the
process.
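As a first step toward an all-Python visualization, the logged detections could be bucketed by the frame they belong to and then drawn onto the replayed video with cv2.rectangle. The sketch below shows the bucketing step; the start time and frame rate are assumptions about the recording, and the row field names follow the CSV columns used in the MATLAB script:

```python
from collections import defaultdict
from datetime import datetime

TIME_FMT = "%Y-%m-%d %H:%M:%S"  # the format the detection script logs

def group_boxes_by_frame(rows, start_time, fps):
    """Bucket logged detections by frame index so an overlay pass can
    draw each frame's boxes (e.g. with cv2.rectangle) while replaying
    the video. 'rows' are dicts keyed by the CSV column names."""
    t0 = datetime.strptime(start_time, TIME_FMT)
    frames = defaultdict(list)
    for row in rows:
        t = datetime.strptime(row["Time"], TIME_FMT)
        idx = round((t - t0).total_seconds() * fps)
        frames[idx].append((row["Position X"], row["Position Y"],
                            row["Width"], row["Height"]))
    return dict(frames)
```

With this mapping in hand, the replay loop only has to look up the current frame counter in the dictionary and draw whatever boxes it finds, keeping the whole pipeline in one language.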
