
Person and Face Detection using Intel OpenVINO toolkit

Last Updated : 04 Oct, 2024

The OpenVINO Toolkit by Intel is a platform that helps developers speed up the deployment of deep learning models for computer vision tasks. It optimizes models for Intel hardware, such as CPUs, GPUs, VPUs, and FPGAs, enabling efficient inference on edge devices. The toolkit is compatible with deep learning frameworks such as TensorFlow, Caffe, and PyTorch, and offers pre-trained models, including person and face detection models, which are well suited to real-time applications.

Overview of Person and Face Detection

Person and face detection are crucial components of many AI-driven applications, ranging from security systems to social media platforms. These tasks involve identifying and localizing human faces and bodies in images or videos. The OpenVINO toolkit provides pre-trained models that can be used for these purposes, enabling developers to implement robust detection systems with minimal effort.

Key Features of OpenVINO for Detection Tasks:

  • Model Optimization: OpenVINO's Model Optimizer converts models trained in popular frameworks like TensorFlow and PyTorch into an Intermediate Representation (IR) format (.xml and .bin files), which is optimized for inference on Intel hardware.
  • Inference Engine: This component executes the optimized models across different devices, ensuring efficient use of resources while maintaining accuracy.
  • Pre-trained Models: OpenVINO offers a variety of pre-trained models for object detection, including those specifically designed for face and person detection.

Setting Up OpenVINO Toolkit for Detection

In order to begin person and face detection with OpenVINO, you must first install the toolkit and configure your environment.

1. Set up OpenVINO:

Download and install the OpenVINO Toolkit, following the installation guidelines specific to your OS.

2. Set up the Essential Environment Variables:

Execute the setup script to configure the environment.

source /opt/intel/openvino/bin/setupvars.sh

3. Get pre-trained models:

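Use the OpenVINO Model Downloader tool to obtain models already trained for person and face detection. Model names are passed as a comma-separated list with no spaces. In pip-based installs (the openvino-dev package), the same tool is exposed as the omz_downloader command.

python <OPENVINO_INSTALL_DIR>/deployment_tools/tools/model_downloader/downloader.py --name person-detection-retail-0013,face-detection-retail-0004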

4. Transform models into Intermediate Representation (IR):

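Models trained in other frameworks need to be converted to OpenVINO's Intermediate Representation (IR) format through the Model Optimizer. The Intel pre-trained models downloaded in the previous step already ship as IR (.xml and .bin) files, so they can be used without this conversion.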

python <OPENVINO_INSTALL_DIR>/deployment_tools/model_optimizer/mo.py --input_model <path_to_model>

Performing Detection Using the OpenVINO Inference Engine

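After setting up the environment and downloading the models, you can perform detection using the OpenVINO Inference Engine. The snippets in this section use the legacy Inference Engine Python API (openvino.inference_engine), which newer OpenVINO releases replace with openvino.runtime; the full walkthrough later in this article uses the newer API. The steps are as follows: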

1. Initialize the Network: Load the person and face detection models into the Inference Engine.

Python
import cv2
from openvino.inference_engine import IECore
ie = IECore()
person_net = ie.read_network(model="person-detection-retail-0013.xml", weights="person-detection-retail-0013.bin")
face_net = ie.read_network(model="face-detection-retail-0004.xml", weights="face-detection-retail-0004.bin")

2. Prepare Input: Resize the input image (a BGR array loaded with cv2.imread) to match the model's expected input size before processing it.

Python
input_blob = next(iter(person_net.input_info))
n, c, h, w = person_net.input_info[input_blob].input_data.shape
image = cv2.resize(image, (w, h))

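3. Execute Inference: Run inference on the preprocessed image.

Python
exec_net = ie.load_network(network=person_net, device_name="CPU")
# The network expects NCHW input: reorder HWC -> CHW and add a batch dimension
input_data = image.transpose((2, 0, 1))[None, ...]
result = exec_net.infer(inputs={input_blob: input_data})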

Running Inference on Detection Models

Inference means running the model on an image or video feed and reading back the detection results. For person and face detection, each result describes a bounding box around a detected person or face, together with a confidence score. The following code reads the detections from the inference result and draws them on the image:

Python
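output_blob = next(iter(person_net.outputs))  # infer() returns a dict keyed by output name
# Each row is [image_id, label, confidence, x_min, y_min, x_max, y_max] with normalized coordinates
for detection in result[output_blob][0][0]:
    if detection[2] > 0.5:  # Confidence threshold
        xmin, ymin, xmax, ymax = int(detection[3] * w), int(detection[4] * h), int(detection[5] * w), int(detection[6] * h)
        cv2.rectangle(image, (xmin, ymin), (xmax, ymax), (0, 255, 0), 2)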

This code draws a bounding box around every detection whose confidence exceeds 50%.
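The scaling and thresholding logic can also be separated from drawing, which makes it easier to check without a model. The following is a minimal sketch, assuming the SSD-style output layout of shape (1, 1, N, 7) used by these models; the function name parse_detections and the sample values are hypothetical and only stand in for a real inference result.

```python
import numpy as np

def parse_detections(raw, width, height, threshold=0.5):
    """Convert an SSD-style output of shape (1, 1, N, 7) into pixel boxes.

    Each row is [image_id, label, confidence, x_min, y_min, x_max, y_max],
    with coordinates normalized to the image size.
    """
    boxes = []
    for _, _, conf, x0, y0, x1, y1 in raw.reshape(-1, 7):
        if conf <= threshold:
            continue
        # Clip to [0, 1]: predictions can fall slightly outside the frame.
        x0, y0, x1, y1 = np.clip([x0, y0, x1, y1], 0.0, 1.0)
        boxes.append((int(x0 * width), int(y0 * height),
                      int(x1 * width), int(y1 * height), float(conf)))
    return boxes

# Synthetic rows standing in for a real result (values are made up).
raw = np.zeros((1, 1, 3, 7), dtype=np.float32)
raw[0, 0, 0] = [0, 1, 0.9, 0.25, 0.5, 0.75, 1.02]  # confident, y_max just outside
raw[0, 0, 1] = [0, 1, 0.2, 0.1, 0.1, 0.2, 0.2]     # below the threshold
boxes = parse_detections(raw, width=400, height=200)
print(boxes)
```

Passing the original frame's width and height, rather than the network input size, yields boxes that can be drawn directly on the unresized image.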

Person and Face Detection using Intel OpenVINO

The steps and code below detect persons and faces in a single image using the openvino.runtime API.

Step 1: Import Libraries

Python
import cv2
# OpenCV library for image processing, including reading, resizing, and displaying images.

import numpy as np
# NumPy library for handling arrays and numerical operations, used for image manipulation.

from openvino.runtime import Core, Tensor
# OpenVINO runtime for model inference, providing tools to load models and create tensor representations.

Step 2: Load the Image

Python
image_path = "/content/OIG2.jpg"  # Replace with your image path
image = cv2.imread(image_path)

# Check if the image was loaded successfully
if image is None:
    raise FileNotFoundError(f"Image not found at path: {image_path}")

Step 3: Resize the Image for Person Detection

Python
resized_image_person = cv2.resize(image, (544, 320))  # cv2.resize takes (width, height); the model input is 320 x 544 (H x W)

resized_image_person = resized_image_person.transpose(2, 0, 1)  # Shape: (3, 320, 544)
resized_image_person = np.expand_dims(resized_image_person, axis=0)  # Shape: (1, 3, 320, 544)
resized_image_person = resized_image_person.astype(np.float32)  # Ensure the data type is correct
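Since the same layout conversion is repeated for the face model later, it can be wrapped in a small helper. This is only a sketch: the name to_nchw is hypothetical, and a zero-filled array stands in for a resized image so that the example runs without OpenCV.

```python
import numpy as np

def to_nchw(image_hwc):
    """Turn an HWC image (as returned by cv2.imread or cv2.resize)
    into a contiguous float32 batch of shape (1, C, H, W)."""
    if image_hwc.ndim != 3:
        raise ValueError(f"expected an HWC image, got shape {image_hwc.shape}")
    chw = image_hwc.transpose(2, 0, 1)  # HWC -> CHW
    return np.ascontiguousarray(chw[np.newaxis, ...], dtype=np.float32)

# A dummy 320 x 544 BGR image stands in for the resized input.
dummy = np.zeros((320, 544, 3), dtype=np.uint8)
blob = to_nchw(dummy)
print(blob.shape, blob.dtype)
```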

Step 4: Initialize OpenVINO and Load the Person Detection Model

Python
core = Core()
person_model_path = "/content/models/intel/person-detection-retail-0013/FP32/person-detection-retail-0013.xml"  # Replace with the correct path
person_model = core.read_model(model=person_model_path)
compiled_person_model = core.compile_model(model=person_model, device_name="CPU")
person_infer_request = compiled_person_model.create_infer_request()
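The resize target in Step 3 is hard-coded to match this model. Once the model is compiled, the expected size can instead be read from its input shape. The sketch below assumes a static NCHW input; the helper name resize_target is hypothetical, and the call on the compiled model is shown only as a comment because it needs the model files.

```python
def resize_target(input_shape):
    """Return the (width, height) pair for cv2.resize from an NCHW input shape."""
    if len(input_shape) != 4:
        raise ValueError(f"expected an NCHW shape, got {tuple(input_shape)}")
    _, _, height, width = input_shape
    return int(width), int(height)

# With a loaded model, the shape would come from the compiled model's first input:
#   size = resize_target(compiled_person_model.input(0).shape)
size = resize_target((1, 3, 320, 544))
print(size)
```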

Step 5: Create OpenVINO Tensor for Person Detection

Python
person_input_tensor = Tensor(resized_image_person)  # Create OpenVINO tensor from numpy array
person_infer_request.set_input_tensor(person_input_tensor)  # Set the input tensor

person_detection_result = person_infer_request.infer()

# Inspect the raw result: a dict that maps each model output to its array
print("Person Detection Result Keys:", person_detection_result)

Output:

Person Detection Result Keys: {<ConstOutput: names[detection_out] shape[1,1,200,7] type: f32>: array([[[[0.        , 1.        , 0.14605983, ..., 0.26581013,
0.3791199 , 0.9395021 ],
[0. , 1. , 0.11663498, ..., 0.00169784,
0.86616117, 0.91260153],
[0. , 1. , 0.11560795, ..., 0.06234652,
0.5887274 , 1.0231483 ],
...,
[0. , 0. , 0. , ..., 0. ,
0. , 0. ],
[0. , 0. , 0. , ..., 0. ,
0. , 0. ],
[0. , 0. , 0. , ..., 0. ,
0. , 0. ]]]], dtype=float32)}

Step 6: Resize the Image and Load the Face Detection Model

Python
resized_image_face = cv2.resize(image, (300, 300))  # cv2.resize takes (width, height); the model input is 300 x 300

# Prepare the image for inference (face detection)
resized_image_face = resized_image_face.transpose(2, 0, 1)  # Shape: (3, 300, 300)
resized_image_face = np.expand_dims(resized_image_face, axis=0)  # Shape: (1, 3, 300, 300)
resized_image_face = resized_image_face.astype(np.float32)  # Ensure the data type is correct

face_model_path = "/content/models/intel/face-detection-retail-0004/FP32/face-detection-retail-0004.xml"  # Replace with the correct path
face_model = core.read_model(model=face_model_path)
compiled_face_model = core.compile_model(model=face_model, device_name="CPU")
face_infer_request = compiled_face_model.create_infer_request()

Step 7: Create OpenVINO Tensor for Face Detection

Python
face_input_tensor = Tensor(resized_image_face)  # Create OpenVINO tensor from numpy array
face_infer_request.set_input_tensor(face_input_tensor)  # Set the input tensor

face_detection_result = face_infer_request.infer()

# Inspect the raw result: a dict that maps each model output to its array
print("Face Detection Result Keys:", face_detection_result)

Output:

Face Detection Result Keys: {<ConstOutput: names[detection_out] shape[1,1,200,7] type: f32>: array([[[[0.        , 1.        , 0.99999857, ..., 0.07185134,
0.60832304, 0.33008844],
[0. , 1. , 0.34375605, ..., 0.16192132,
0.7964552 , 0.33839393],
[0. , 1. , 0.22114603, ..., 0.7095434 ,
0.16954124, 0.7918562 ],
...,
[0. , 0. , 0. , ..., 0. ,
0. , 0. ],
[0. , 0. , 0. , ..., 0. ,
0. , 0. ],
[0. , 0. , 0. , ..., 0. ,
0. , 0. ]]]], dtype=float32)}
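Both outputs have a fixed shape of (1, 1, 200, 7), so most rows are all-zero padding, as the printed arrays show. A minimal sketch of dropping those rows before further processing follows; the helper name valid_rows and the sample values are hypothetical. With a real result, the array could be taken from the dict with next(iter(face_detection_result.values())).

```python
import numpy as np

def valid_rows(result_array, min_confidence=0.0):
    """Flatten a (1, 1, N, 7) output to (N, 7) and keep rows whose
    confidence (column 2) is above min_confidence."""
    rows = np.asarray(result_array).reshape(-1, 7)
    return rows[rows[:, 2] > min_confidence]

# Synthetic output: two real rows followed by zero padding (values are made up).
raw = np.zeros((1, 1, 200, 7), dtype=np.float32)
raw[0, 0, 0] = [0, 1, 0.99, 0.07, 0.10, 0.60, 0.33]
raw[0, 0, 1] = [0, 1, 0.34, 0.16, 0.20, 0.79, 0.33]

kept = valid_rows(raw)             # drops only the padding rows
confident = valid_rows(raw, 0.5)   # also applies a confidence threshold
print(kept.shape, confident.shape)
```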

Step 8: Post-processing and Display Results

Python
def display_detections(image, detections, threshold=0.5):
    height, width = image.shape[:2]
    for detection in detections[0][0]:  # Output shape is (1, 1, N, 7); [0][0] yields the N detection rows
        confidence = detection[2]
        if confidence > threshold:
            xmin = int(detection[3] * width)
            ymin = int(detection[4] * height)
            xmax = int(detection[5] * width)
            ymax = int(detection[6] * height)
            cv2.rectangle(image, (xmin, ymin), (xmax, ymax), (0, 255, 0), 2)
    return image

# Draw bounding boxes on the original image
output_image = image.copy()  # Make a copy of the original image to draw on

# Each model has a single output, so take the first (and only) array from each result
person_detections = next(iter(person_detection_result.values()))
face_detections = next(iter(face_detection_result.values()))

# Draw person detections
output_image = display_detections(output_image, person_detections)
# Draw face detections
output_image = display_detections(output_image, face_detections)

# Display or save the output image with detections
cv2.imshow("Person and Face Detections", output_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

# Optionally, you can save the output image
cv2.imwrite("output_with_detections.jpg", output_image)

Output:

Figure: Person and Face Detection using Intel OpenVINO toolkit

Optimizing Detection Models for Performance

OpenVINO offers various optimization methods to enhance performance, particularly on edge devices.

  1. FP16 Precision: Convert models to FP16 precision for deployment on edge devices such as Intel Movidius NCS to speed up computation, usually with little loss of accuracy.
  2. Batch Size Optimization: Adjust the batch size to improve throughput when processing several images at once.
  3. CPU Extensions: In older OpenVINO releases, load the CPU extensions library to optimize performance on Intel CPUs; recent releases build these optimizations into the CPU plugin.
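For batch size optimization, preprocessed images have to be combined into a single input array. The sketch below shows only that stacking step, with a hypothetical helper named make_batch and dummy frames. It assumes the model has been reshaped and compiled to accept a batch of N images, which is not shown here; in the output rows, the image_id column then identifies which image of the batch a detection belongs to.

```python
import numpy as np

def make_batch(blobs):
    """Stack preprocessed (1, C, H, W) blobs into one (N, C, H, W) batch."""
    shapes = {blob.shape for blob in blobs}
    if len(shapes) != 1:
        raise ValueError("need at least one blob, and all blobs must share a shape")
    return np.concatenate(blobs, axis=0)

# Four dummy frames, each filled with its own index so they can be told apart.
frames = [np.full((1, 3, 320, 544), i, dtype=np.float32) for i in range(4)]
batch = make_batch(frames)
print(batch.shape)
```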

Deploying Detection Models on Edge Devices

OpenVINO excels in its capability to implement models on edge devices such as Intel CPUs and VPUs. Deployment consists of moving the optimized model and inference script to the device, where real-time inference can be performed. The Intel Movidius Neural Compute Stick (NCS) is a common choice for implementing OpenVINO models on the edge.

  1. Transfer Model and Code: Move the IR files (.xml and .bin) and the Python script to the edge device.
  2. Perform Inference on Edge Device: Make sure the OpenVINO runtime environment is correctly configured on the edge device, then conduct inference just like you would on your development machine.
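The same script can run on different devices if the device name is chosen at startup rather than hard-coded. The following is a sketch under assumptions: pick_device is a hypothetical helper, and the device list passed to it is an example. On a real system, the list would come from Core().available_devices, and the name for a Neural Compute Stick (MYRIAD in the releases that support it) depends on the installed OpenVINO version.

```python
def pick_device(available, preferred=("GPU", "CPU")):
    """Return the first available device that matches the preference order."""
    for name in preferred:
        for device in available:
            # Device names can carry a suffix such as "GPU.0", so match the prefix too.
            if device == name or device.startswith(name + "."):
                return device
    raise RuntimeError(f"none of {preferred} found in {list(available)}")

# Example list; with OpenVINO installed this would be Core().available_devices.
device = pick_device(["CPU", "GPU.0"])
print(device)
```

The returned name can then be passed as device_name when compiling the model.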

Evaluating and Fine-Tuning the Detection Results

Once the model is implemented, it is crucial to assess its effectiveness and adjust it as needed. These steps can enhance the accuracy of detection.

  1. Modify the Confidence Threshold: Tweaking the confidence threshold (like 0.5 or 0.6) can enhance detection outcomes by eliminating predictions with low confidence.
  2. Non-Maximum Suppression (NMS): Use NMS to remove overlapping detections of the same object, keeping only the highest-scoring box, especially when many detections are close together.
  3. Post-Processing: To enhance tracking stability, post-processing methods like smoothing bounding box coordinates can be implemented across frames in a video.
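To illustrate the NMS step, here is a minimal greedy implementation in NumPy. It is a sketch rather than the routine OpenVINO uses internally: SSD-style models typically apply NMS inside their detection output layer already, so an extra pass like this is mainly useful when combining boxes from several sources. The function name nms, the box values, and the thresholds are illustrative assumptions.

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression.

    boxes: sequence of (x_min, y_min, x_max, y_max); scores: one value per box.
    Returns the indices of the kept boxes, highest score first.
    """
    boxes = np.asarray(boxes, dtype=np.float32).reshape(-1, 4)
    scores = np.asarray(scores, dtype=np.float32)
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    order = scores.argsort()[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))
        # Intersection of the best box with every remaining box
        x0 = np.maximum(boxes[best, 0], boxes[rest, 0])
        y0 = np.maximum(boxes[best, 1], boxes[rest, 1])
        x1 = np.minimum(boxes[best, 2], boxes[rest, 2])
        y1 = np.minimum(boxes[best, 3], boxes[rest, 3])
        inter = np.clip(x1 - x0, 0, None) * np.clip(y1 - y0, 0, None)
        union = areas[best] + areas[rest] - inter
        iou = inter / np.maximum(union, 1e-9)
        # Drop boxes that overlap the best box too much
        order = rest[iou <= iou_threshold]
    return keep

boxes = [(10, 10, 110, 110), (20, 20, 120, 120), (200, 200, 260, 260)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)
```

In this example the second box overlaps the first with an IoU of about 0.68, so it is suppressed at the default threshold of 0.5 and would be kept at a threshold of 0.7.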

Conclusion

The Intel OpenVINO Toolkit provides a robust platform for implementing person and face detection models on Intel devices, particularly for edge computing. Its optimization tools, pre-trained models, and straightforward deployment on CPUs, VPUs, and FPGAs make it a strong option for real-time computer vision applications. Whether you use person detection for surveillance or face detection for retail analytics, OpenVINO offers efficient inference and the flexibility to handle a range of deep learning-based detection tasks.

