Happymonk Data Engineer Intern Assignment

This code implements a real-time video analytics pipeline that processes video frames, applies analytics, and stores results in a database. It defines a Camera class to ingest video streams and extract frame information. Multiple camera streams are processed concurrently by creating a thread for each. Frame details are written to JSON files and a SQLite database. The code also includes functions to retrieve relevant batches and frames based on a timestamp, and create a video file from the retrieved frames.


Task 1:

Source Code

import cv2
import json
import time
import threading
from datetime import datetime
import sqlite3

# Create a connection to the database.
# check_same_thread=False lets the camera threads share this connection;
# a lock below serializes access to it.
conn = sqlite3.connect('video_data.db', check_same_thread=False)
c = conn.cursor()

# Lock to serialize database access across camera threads
db_lock = threading.Lock()

# Create the table (if it does not already exist)
c.execute('''CREATE TABLE IF NOT EXISTS batches
             (batch_id text, starting_frame_id text, ending_frame_id text,
              timestamp text)''')

# Save (commit) the changes
conn.commit()

# Camera class
class Camera:
    def __init__(self, camera_id, geo_location, video_path):
        self.camera_id = camera_id
        self.geo_location = geo_location
        self.video_path = video_path
        self.frame_id = 0

    def process_frame(self, frame):
        # Process the frame and extract information.
        # For simplicity, just write the frame to an image file.
        image_path = f"{self.camera_id}_{self.frame_id}.jpg"
        cv2.imwrite(image_path, frame)
        return {
            "camera_id": self.camera_id,
            "frame_id": self.frame_id,
            "geo_location": self.geo_location,
            "image_path": image_path,
        }

    def ingest_stream(self):
        cap = cv2.VideoCapture(self.video_path)
        while cap.isOpened():
            ret, frame = cap.read()
            if ret:
                info = self.process_frame(frame)
                self.frame_id += 1
                yield info
            else:
                break
        cap.release()

# Function to handle each camera
def handle_camera(camera):
    for info in camera.ingest_stream():
        # Write info to a JSON Lines file (one object per line)
        with open(f"{camera.camera_id}.json", "a") as f:
            json.dump(info, f)
            f.write("\n")

        # Write info to the database, holding the lock for thread safety
        with db_lock:
            c.execute("INSERT INTO batches VALUES (?, ?, ?, ?)",
                      (info["camera_id"], str(info["frame_id"]),
                       str(info["frame_id"]), str(datetime.now())))
            conn.commit()

        time.sleep(1)  # Process one frame per second

# List of cameras
cameras = [
    Camera("camera1", "location1", "video1.mp4"),
    Camera("camera2", "location2", "video2.mp4"),
]

# Create and start a thread for each camera
threads = []
for camera in cameras:
    thread = threading.Thread(target=handle_camera, args=(camera,))
    thread.start()
    threads.append(thread)

# Wait for all threads to finish
for thread in threads:
    thread.join()

# Close the connection to the database
conn.close()
This code does the following:

1. It defines a Camera class that simulates the ingestion of a live video stream and processes each
frame.

2. It creates a separate thread for each camera to handle multiple streams concurrently.

3. It writes the information of each frame to a JSON file and a SQLite database.

Detailed Word document

Real-Time Video Analytics Pipeline

This application is designed to perform real-time analytics on live video streams. It processes video frames in real time, applies analytics, and stores the results in a database. The application is built in Python and uses libraries such as OpenCV for video processing and SQLite for data storage.

Components
The application consists of the following components:
1. Video Stream Ingestion: This component simulates the ingestion of a live video
stream. It uses OpenCV to read video frames from a video file and treats it as a live
stream. It captures frames continuously from the source.
2. Frame Processing: This component takes each incoming video frame and performs
the following actions:
o Processes the frame and creates a JSON object for each frame.
o Extracts relevant information from the processed frame. The JSON object
contains the camera ID, frame ID, geo-location, and the path to the image
file.
o Writes one frame per second as an image file.
3. Batching: This component performs batching of the processed frames based on the
duration value specified in the config file. It creates a dictionary for every batch that
consists of the batch ID, starting frame ID, ending frame ID, and timestamp.
4. Data Storage: This component stores the batch information in a SQLite database. It
creates necessary tables and columns to store the batch information. Every batch
information is logged in the database.
5. Error Handling and Logging: This component implements error handling and logging
mechanisms to capture and handle exceptions that may occur during frame
processing, data storage, or transmission. It ensures that the application logs
relevant information for debugging.
6. Concurrency and Performance: This component modifies the application to handle
multiple camera streams concurrently. It ensures thread safety and avoids race
conditions.
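The batching component described above is not implemented in the Task 1 listing. A minimal sketch of how frame records could be grouped into fixed-size batches might look like the following; `batch_duration` stands in for the config-file value, and the uuid-based batch IDs are an assumption, not part of the original code:

```python
import uuid
from datetime import datetime

def batch_frames(frames, batch_duration=25):
    """Group frame-info dicts into batches of `batch_duration` frames.
    At one frame per second, batch_duration doubles as seconds per batch."""
    batches = []
    for start in range(0, len(frames), batch_duration):
        chunk = frames[start:start + batch_duration]
        batches.append({
            "batch_id": str(uuid.uuid4()),
            "starting_frame_id": chunk[0]["frame_id"],
            "ending_frame_id": chunk[-1]["frame_id"],
            "timestamp": str(datetime.now()),
        })
    return batches

# Example: 60 one-second frames -> batches of 25, 25 and 10 frames
frames = [{"frame_id": i} for i in range(60)]
batches = batch_frames(frames)
print(len(batches))                       # 3
print(batches[0]["starting_frame_id"],
      batches[0]["ending_frame_id"])      # 0 24
```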

Usage
To use the application, you need to create a Camera object for each camera stream. You
need to provide the camera ID, geo-location, and the path to the video file. Then, you can
start the application by creating and starting a thread for each camera.

Here’s an example:
cameras = [
    Camera("camera1", "location1", "video1.mp4"),
    Camera("camera2", "location2", "video2.mp4"),
]

threads = []
for camera in cameras:
    thread = threading.Thread(target=handle_camera, args=(camera,))
    thread.start()
    threads.append(thread)

for thread in threads:
    thread.join()

This will start the application and it will begin processing the video streams. The results will
be stored in a SQLite database and a JSON file.
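The error handling and logging component described earlier is likewise not shown in the listing. A hypothetical sketch of wrapping each camera thread's work so that a failure is logged with the camera ID rather than silently killing the thread (`run_with_logging` and the stand-in workers are illustrative, not part of the original code):

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(threadName)s %(levelname)s %(message)s",
)
log = logging.getLogger("pipeline")

def run_with_logging(worker, camera_id):
    """Run a per-camera worker, logging success or the full traceback on
    failure, so one failing stream does not stop the other threads."""
    try:
        worker()
        log.info("camera %s finished", camera_id)
        return True
    except Exception:
        log.exception("camera %s failed", camera_id)
        return False

# Example with stand-in workers
ok = run_with_logging(lambda: None, "camera1")
bad = run_with_logging(lambda: 1 / 0, "camera2")
print(ok, bad)  # True False
```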
Task 2:

Source Code:
import cv2
import json
from datetime import datetime, timedelta
import sqlite3

# Create a connection to the database
conn = sqlite3.connect('video_data.db')
c = conn.cursor()

# Function to get batch information from the database
def get_batches(timestamp, duration):
    # Convert the timestamp string to a datetime object
    timestamp = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")

    # Get batches that start within the specified duration after the timestamp
    c.execute('''SELECT * FROM batches
                 WHERE timestamp BETWEEN ? AND ?''',
              (str(timestamp), str(timestamp + timedelta(seconds=duration))))
    return c.fetchall()

# Function to get frame information from the JSON file
def get_frames(batch):
    frames = []

    # Open the camera's JSON Lines file
    with open(f"{batch[0]}.json", "r") as f:
        # Read the file line by line (each line is a JSON object)
        for line in f:
            # Parse the JSON object
            frame = json.loads(line)

            # Check if the frame ID is within the batch
            # (frame IDs come back from SQLite as text, so cast to int)
            if int(batch[1]) <= frame["frame_id"] <= int(batch[2]):
                frames.append(frame)

    return frames

# Function to create a video file from the frames
def create_video(frames):
    if not frames:
        return

    # Open the first frame to get the size
    img = cv2.imread(frames[0]["image_path"])
    height, width, layers = img.shape

    # Create a VideoWriter object
    video = cv2.VideoWriter('output.mp4', cv2.VideoWriter_fourcc(*'mp4v'),
                            25, (width, height))

    # Write the frames to the video file
    for frame in frames:
        img = cv2.imread(frame["image_path"])
        video.write(img)

    # Release the VideoWriter object
    video.release()

# Function to handle user input
def handle_user_input():
    # Get user input
    timestamp = input("Enter the timestamp (YYYY-MM-DD HH:MM:SS): ")
    duration = int(input("Enter the duration of the video file (in seconds): "))

    # Get the batches from the database
    batches = get_batches(timestamp, duration)

    # Get the frames from the JSON file and create a video file for each batch
    for batch in batches:
        frames = get_frames(batch)
        create_video(frames)

# Start the application
handle_user_input()

# Close the connection to the database
conn.close()

This code does the following:


1. It defines several functions to get batch information from the database, get frame
information from a JSON file, create a video file from the frames, and handle user
input.
2. It starts the application by calling the handle_user_input function, which gets user
input, retrieves the relevant batches and frames, and creates a video file for each
batch.
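The frame-retrieval logic can be exercised in isolation. The sketch below filters JSON Lines records by a batch's frame-ID range, with in-memory strings standing in for the camera file; note the cast to `int`, since frame IDs come back from SQLite as text:

```python
import json

def frames_in_range(lines, start_id, end_id):
    """Parse JSON Lines records and keep those whose frame_id falls in
    [start_id, end_id]; the bounds arrive as text from the database."""
    frames = []
    for line in lines:
        frame = json.loads(line)
        if int(start_id) <= frame["frame_id"] <= int(end_id):
            frames.append(frame)
    return frames

# Hypothetical records as they would appear in camera1.json
records = [json.dumps({"camera_id": "camera1", "frame_id": i})
           for i in range(5)]
print([f["frame_id"] for f in frames_in_range(records, "1", "3")])  # [1, 2, 3]
```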

Real-Time Video Analytics Pipeline


This application is designed to perform real-time analytics on live video streams. It processes
video frames in real-time, applies analytics, and stores the results in a database. The
application is built in Python and uses libraries like OpenCV for video processing and SQLite
for data storage.

Components
The application consists of the following components:
1. User Input: This component handles user input. It prompts the user to enter a
timestamp and the duration of the video file.
2. Batch Retrieval: This component retrieves batch information from the database. It
gets all batches that start within the specified duration after the timestamp.
3. Frame Retrieval: This component retrieves frame information from a JSON file. It
reads the JSON file line by line and checks if the frame ID is within the batch.
4. Video Creation: This component creates a video file from the frames. It opens the
first frame to get the size, creates a VideoWriter object, and writes the frames to the
video file.
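The batch-retrieval query relies on an implementation detail worth noting: timestamps are stored as text via `str(datetime.now())`, and because that format is zero-padded, lexicographic comparison with SQLite's `BETWEEN` agrees with chronological order. A self-contained sketch (in-memory database, hypothetical rows):

```python
import sqlite3
from datetime import datetime, timedelta

# In-memory database with the same schema as video_data.db
conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.execute('''CREATE TABLE batches
             (batch_id text, starting_frame_id text, ending_frame_id text,
              timestamp text)''')

# Three hypothetical batches logged one minute apart
base = datetime(2023, 12, 5, 10, 32, 0)
for i in range(3):
    c.execute("INSERT INTO batches VALUES (?, ?, ?, ?)",
              (f"batch{i}", str(i * 25), str(i * 25 + 24),
               str(base + timedelta(minutes=i))))

# BETWEEN on the text column matches chronological order because
# str(datetime) is zero-padded ("2023-12-05 10:32:00")
c.execute("SELECT batch_id FROM batches WHERE timestamp BETWEEN ? AND ?",
          (str(base), str(base + timedelta(seconds=90))))
matched = [row[0] for row in c.fetchall()]
print(matched)  # ['batch0', 'batch1']
```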

Usage
To use the application, you just need to run the script. It will prompt you to enter a
timestamp and the duration of the video file. You should enter the timestamp in the format
“YYYY-MM-DD HH:MM:SS” and the duration in seconds. The application will then retrieve
the relevant batches and frames and create a video file for each batch.
Here’s an example:
Enter the timestamp (YYYY-MM-DD HH:MM:SS): 2023-12-05 10:32:37
Enter the duration of the video file (in seconds): 60

The application will then retrieve the matching batches from the SQLite database, read the corresponding frames from the JSON files, and write an output video file for each batch.
