Documentation ML
Team Name
Data_Dynamo_
Anisha Roy
Ayushi Singh
Ayushi Vardhan
Table of Contents
1. Requirements
2. Assumptions
3. Functions
4. download_random_sample_images
5. extract_text_from_image
6. extract_entity
7. make_predictions
8. Usage
9. How it Works
10. Modifications
Requirements
Assumptions
CSV file: The input dataset test.csv has the following structure:
Functions
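import pandas as pd
import pytesseract
import cv2
import requests
import numpy as np
import re

# Load the training and testing datasets into pandas DataFrames
train_df = pd.read_csv("train.csv")
test_df = pd.read_csv("test.csv")

# Display the first few rows of the data to understand its structure
print("Train Data:")
print(train_df.head())
print("Test Data:")
print(test_df.head())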
Explanation:
train.csv and test.csv: The training and testing datasets containing
image URLs and other relevant columns. These CSV files are loaded into
pandas DataFrames.
The print statements display the first few rows of both datasets so that
the data structure can be inspected.
2. Image Downloading
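def download_image(url):
    try:
        # The timeout keeps a dead link from stalling the whole run
        response = requests.get(url, timeout=10)
        img = np.array(bytearray(response.content), dtype=np.uint8)
        # Decode as 3-channel BGR so the later BGR-to-grayscale conversion
        # also works for images that are stored as grayscale
        img = cv2.imdecode(img, cv2.IMREAD_COLOR)
        if img is None:
            raise ValueError("Image could not be decoded")
        print(f"Image downloaded successfully: {url}")
        print(f"Image shape: {img.shape}")
        return img
    except Exception as e:
        print(f"Error downloading or decoding image: {e}")
        return None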
Explanation:
Purpose: Downloads an image from a given URL and decodes it into a
format that can be processed by OpenCV.
Error Handling: If the image cannot be downloaded or decoded, an
exception is caught and an error message is printed.
Return: Returns the image as a NumPy array, or None if the download or
the decoding fails.
def extract_text_from_image(url):
    image = download_image(url)
    if image is not None:
        try:
            gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
            extracted_text = pytesseract.image_to_string(gray)
            return extracted_text
        except Exception as e:
            print(f"Error in OCR processing: {e}")
    return ""
Explanation:
Input: Takes the URL of an image.
Download and Preprocessing: The image is downloaded and
converted to grayscale using OpenCV, which helps improve the accuracy
of the OCR process.
OCR: The grayscale image is passed to Tesseract-OCR to extract text.
Error Handling: Any exception raised during OCR processing is caught and
its message is printed.
Return: Returns the extracted text from the image, or an empty string
if the download or the OCR step fails.
patterns = {
    'item_weight': r'(\d+\.?\d*)\s*(gram|kilogram|ounce|pound)',
    'item_volume': r'(\d+\.?\d*)\s*(milliliter|liter|gallon)',
    'item_dimension': r'(\d+\.?\d*)\s*(centimetre|meter|inch)',
    # Add more patterns as needed
}
Explanation:
Patterns: A dictionary of regular expressions for extracting specific
entities such as weight, volume, or dimensions from text.
Example: For weight, the pattern looks for a number followed by a
weight unit like "gram", "kilogram", "ounce", or "pound".
Functionality: The extract_entity function uses these patterns to
search for the relevant entity in the extracted text.
Return: If a match is found, it returns the extracted entity value;
otherwise, it returns an empty string.
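The body of extract_entity is not reproduced in this document. A minimal sketch of the behavior described above might look like the following. The signature (OCR text plus an entity name that keys into patterns), the case-insensitive matching, and the "value unit" output format are all assumptions, not the team's confirmed implementation.

```python
import re

# Same patterns as in the dictionary shown above.
patterns = {
    'item_weight': r'(\d+\.?\d*)\s*(gram|kilogram|ounce|pound)',
    'item_volume': r'(\d+\.?\d*)\s*(milliliter|liter|gallon)',
    'item_dimension': r'(\d+\.?\d*)\s*(centimetre|meter|inch)',
}

def extract_entity(text, entity_name):
    """Return the first '<number> <unit>' match for entity_name, or ''.

    Assumed signature: the original function is not shown in the source.
    """
    pattern = patterns.get(entity_name)
    if pattern is None or not text:
        return ""
    # OCR output varies in letter case, so match case-insensitively (assumption).
    match = re.search(pattern, text, flags=re.IGNORECASE)
    if match is None:
        return ""
    value, unit = match.groups()
    return f"{value} {unit.lower()}"

print(extract_entity("Net weight: 500 gram", "item_weight"))
print(extract_entity("No measurements here", "item_volume"))
```

Note that the patterns only match the unit spellings listed in the dictionary, so abbreviations such as "g" or "ml" in the OCR text would yield an empty string unless further patterns are added.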
import random
output_df = pd.DataFrame({
    'index': test_df['index'],
    'prediction': predictions_padded
})
Explanation:
The code imports the random module, sets a sample size, and generates
random indices. It then loops through each index, extracts the image
link, extracts text from the image, extracts the entity, and appends the
prediction to a list. The final list contains extracted entities for each
sampled row.
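The sampling loop itself is described but not shown. The steps can be sketched roughly as below, under several assumptions: each row is a dict with the hypothetical keys image_link and entity_name (for instance as produced by test_df.to_dict("records")), and the two extraction functions are passed in as arguments so that the loop can be tried without network access or Tesseract. predict_for_sample and the stand-in functions are illustrative names only.

```python
import random

def predict_for_sample(rows, extract_text, extract_entity, sample_size, seed=None):
    """Run text extraction and entity extraction on a random sample of rows.

    rows: list of dicts with the hypothetical keys 'image_link' and 'entity_name'.
    Returns (sampled_indices, predictions), aligned position by position.
    """
    rng = random.Random(seed)
    # Never ask for more rows than are available.
    sample_size = min(sample_size, len(rows))
    sampled_indices = rng.sample(range(len(rows)), sample_size)

    predictions = []
    for idx in sampled_indices:
        row = rows[idx]
        text = extract_text(row["image_link"])
        predictions.append(extract_entity(text, row["entity_name"]))
    return sampled_indices, predictions

# Stand-in functions so the sketch runs offline; the real pipeline would
# pass extract_text_from_image and extract_entity instead.
def fake_extract_text(url):
    return "Net weight 250 gram"

def fake_extract_entity(text, entity_name):
    return "250 gram" if "gram" in text else ""

rows = [
    {"image_link": f"https://example.com/{i}.jpg", "entity_name": "item_weight"}
    for i in range(10)
]
indices, preds = predict_for_sample(
    rows, fake_extract_text, fake_extract_entity, sample_size=3, seed=0
)
print(len(indices), preds)
```

Returning the sampled indices together with the predictions keeps it possible to line the predictions up with the original rows afterwards.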
6. Saving Predictions to CSV
import pandas as pd
import numpy as np
# Calculate F1 score
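def calculate_f1_score(gt, out):
    true_positives = 0
    false_positives = 0
    false_negatives = 0
    true_negatives = 0
    for i in range(len(gt)):
        if out[i] != "" and gt[i] != "" and out[i] == gt[i]:
            true_positives += 1
        elif out[i] != "" and gt[i] != "" and out[i] != gt[i]:
            false_positives += 1
        elif out[i] != "" and gt[i] == "":
            false_positives += 1
        elif out[i] == "" and gt[i] != "":
            false_negatives += 1
        elif out[i] == "" and gt[i] == "":
            true_negatives += 1
    # Precision and recall default to 0 when their denominators are 0
    predicted = true_positives + false_positives
    expected = true_positives + false_negatives
    precision = true_positives / predicted if predicted > 0 else 0.0
    recall = true_positives / expected if expected > 0 else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

actual_entities = test_df['actual_entity'].tolist()
f1_score = calculate_f1_score(actual_entities, predictions)
print(f"F1 Score: {f1_score:.4f}")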
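For reference, the standard definitions that turn the counters above into an F1 score are given below, where TP, FP and FN stand for the true-positive, false-positive and false-negative counts. True negatives do not enter the score.

```latex
\[
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2 P R}{P + R}
\]
```

As an illustration with made-up counts (not results from this project): TP = 6, FP = 2 and FN = 2 give P = 6/8 = 0.75, R = 6/8 = 0.75, and F1 = 2(0.75)(0.75)/1.5 = 0.75.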
Usage
Download a random sample of images:
sampled_test_df = download_random_sample_images(test_df, sample_size=100)
Generate predictions for the sampled images:
make_predictions(sampled_test_df)
How it Works
DataFrame Creation: A new DataFrame is created to store the
predictions along with the corresponding index from the test dataset.
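The name predictions_padded suggests that the predictions for the sampled rows are expanded to cover every row of test_df before the DataFrame is built. How the padding is actually done is not shown in this document. The sketch below assumes that unsampled rows receive an empty string and that predictions are placed by row position; pad_predictions is a hypothetical helper name.

```python
def pad_predictions(sampled_indices, predictions, total_rows):
    """Spread sampled predictions over total_rows slots.

    Assumption: rows that were not sampled keep an empty-string prediction.
    """
    padded = [""] * total_rows
    for idx, pred in zip(sampled_indices, predictions):
        padded[idx] = pred
    return padded

# Two sampled rows (positions 4 and 1) out of six rows in total.
predictions_padded = pad_predictions([4, 1], ["500 gram", "2 liter"], total_rows=6)
print(predictions_padded)
```

The padded list has one entry per row, so it can serve as the prediction column of the output DataFrame, which pandas can then write out with DataFrame.to_csv.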
Summary
The results are saved in a CSV file, making this process useful for bulk
OCR and entity extraction tasks from images.