
Report of the Summer Internship

ON

Artificial Intelligence and Machine Learning

AT

Company Name: AICTE EDUSKILLS – GOOGLE ACADEMY

Duration: 5th April 2024 – 25th June 2024

BY

Mr. Saketha Rama Tummalacherla

(Roll No. 2451-22-750-018)

Department of Computer Science and Engineering

M.V.S.R. ENGINEERING COLLEGE

(An Autonomous Institution)

(Affiliated to Osmania University & Recognized by AICTE)

Nadergul, Saroor Nagar Mandal, Hyderabad – 501 510

2024-2025

CERTIFICATE

DECLARATION

This is to certify that the Summer Internship entitled Artificial Intelligence and Machine Learning Virtual Internship is a record of bona fide work done by me as part of the internship at the Google Academy. The report is based on the project work done entirely by me and not copied from any other source.

Signature of the student

Saketha Rama Tummalacherla

2451-22-750-018

ACKNOWLEDGEMENT

I would like to express my gratitude to all the people behind the screen who helped me transform an idea into a real application.

I would like to thank K. Padma, Assistant Professor, Department of Computer Science and Engineering, for her technical guidance, constant encouragement, and support in carrying out my project at college.

I profoundly thank J. Prasanna Kumar, Head of the Department of Computer Science and Engineering, who has been an excellent guide and a great source of inspiration for my work.

I would like to express my heartfelt gratitude to my parents, without whom I would not have been privileged to achieve and fulfil my dreams. I am grateful to our principal, Dr. Vijaya Gunturu, who most ably runs the institution and has had a major hand in enabling me to do my project.

The satisfaction and euphoria that accompany the successful completion of a task are incomplete without mentioning the people whose constant guidance and encouragement made it possible. In this context, I would like to thank all the other staff members, both teaching and non-teaching, who have extended their timely help and eased my task.

VISION AND MISSION

VISION

• To impart technical education of the highest standards, producing competent and confident engineers with an ability to use computer science knowledge to solve societal problems.

MISSION

• To make the learning process exciting, stimulating and interesting.

• To impart adequate fundamental knowledge and soft skills to students.

• To expose students to advanced computer technologies in order to excel in engineering practices by bringing out the creativity in students.

• To develop economically feasible and socially acceptable software.

PROGRAM EDUCATIONAL OBJECTIVES (PEOs)

The Bachelor's program in Computer Science and Engineering is aimed at preparing graduates who will:

PEO-1: Achieve recognition through demonstration of technical competence for successful execution of software projects to meet customer business objectives.

PEO-2: Practice life-long learning by pursuing professional certifications, higher education or research in the emerging areas of information processing and intelligent systems at a global level.

PEO-3: Contribute to society by understanding the impact of computing using a multidisciplinary and ethical approach.

Program Outcomes (PO’s)

1. Engineering Knowledge: Apply knowledge of mathematics, science, engineering fundamentals, and computer science to solve complex engineering problems.

2. Problem Analysis: Identify, formulate, and analyse real-world problems using research-based knowledge and principles of computer science.

3. Design and Development of Solutions: Design and develop software solutions and systems to meet specified needs, considering public health, safety, and environmental concerns.

4. Investigation of Complex Problems: Use research-based knowledge to investigate complex problems, design experiments, and interpret data to reach valid conclusions.

5. Modern Tool Usage: Select and apply modern tools, techniques, and resources for computing and engineering activities, understanding their limitations.

6. The Engineer and Society: Assess societal, legal, health, safety, and cultural issues relevant to professional engineering practice.

7. Environment and Sustainability: Understand the role of computing solutions in sustainable development and their impact on the environment and society.

8. Ethics: Commit to professional ethics, responsibilities, and norms of computer science and engineering practices.

9. Individual and Team Work: Function effectively as an individual, team member, or leader in multidisciplinary settings.

10. Communication: Communicate effectively on complex engineering activities through reports, documentation, and presentations.

11. Project Management and Finance: Demonstrate knowledge of project management and financial principles to manage projects effectively as an individual or team leader.

12. Lifelong Learning: Recognize the need for lifelong learning and the ability to adapt to technological advancements and innovations.

Course Objectives:

1. To give an experience to the students in solving real-life practical problems with all their constraints.

2. To give an opportunity to integrate different aspects of learning with reference to real-life problems.

3. To enhance the confidence of the students while communicating with industry engineers.

4. To give an opportunity for useful interaction with industry engineers.

5. To familiarize students with the work culture and ethics of the industry.

Course Outcomes:

On completion of this course, the student will be able to:

1. Design/develop a small and simple product in hardware or software.

2. Complete the task or realize a pre-specified target, with limited scope, rather than taking up a complex task and leaving it.

3. Learn to find alternate viable solutions for a given problem and evaluate these alternatives with reference to pre-specified criteria.

4. Implement the selected solution.

5. Document the implemented solution.

ABSTRACT

The Google AI/ML internship is a rigorous, structured program designed to let participants learn about artificial intelligence and machine learning in depth, including but not limited to the use of TensorFlow. Interns undertake different areas of work, including theory and practice courses meant to promote practical application of the knowledge gained.

Progress through the course is demonstrated by successively unlocking badges on the Google Developer Profile, each of which marks proficiency in essential skills such as object detection, image classification and product image search.

By working hands-on with TensorFlow over the course of the internship, participants gain the ability to design and deploy machine learning models with ease. They also use Google Colab, an online platform that provides a hosted Jupyter notebook environment for scientific computing and for writing scientific documents. This makes experimentation and model development more flexible.

The program structure involves mentorship meetings, group projects, and code reviews together with other interns, aiming at a balanced learning experience and skill building.

These skills are put to good use in the real-world application of AI/ML. During code reviews and presentations, communication and problem-solving skills are further exercised, thus preparing students for real-world workplace scenarios. The technology employed in the program accurately reflects industry standards, and the emphasis on team learning equips participants with skills and knowledge that are highly valued in modern AI/ML work. At the end of the internship, trainees leave with a strong knowledge of the basic approaches to artificial intelligence and machine learning and a practical toolkit that helps them solve such tasks. Through the internship program, trainees are well equipped to handle future opportunities in the digital space.

Table of Contents

Report of the Summer Internship .......... i
Certificate .......... ii
Declaration .......... iii
Acknowledgement .......... iv
Vision & Mission .......... v
Program Educational Objectives (PEOs) .......... v
Program Outcomes (PO's) .......... vii
Course Objectives .......... ix
Course Outcomes .......... ix
Abstract .......... x
Table of Contents .......... 12
List of Figures .......... 14
1. INTRODUCTION .......... 14
2. Program neural networks with TensorFlow .......... 17
2.1 Introduction to Computer Vision .......... 17
2.2 Introduction to Convolutions .......... 17
2.3 Convolutional Neural Networks (CNNs) .......... 17
2.4 Using CNNs with Larger Datasets .......... 18
3. Get started with object detection .......... 19
3.1 Add ML Kit Object Detection and Tracking API to the project .......... 19
3.2 Add on-device object detection .......... 21
3.3 Set up and run on-device object detection on an image .......... 21
3.4 Post-processing the detection results .......... 22
3.5 Understand the visualization utilities .......... 23
3.6 Visualize the ML Kit detection result .......... 23
4. Go further with object detection .......... 25
4.1 Object Detection .......... 25
4.2 TensorFlow Lite .......... 25
5. Get started with product image search .......... 30
5.1 Detect objects in images to build a visual product search with ML Kit: Android .......... 30
5.1.1 Import the app into Android Studio .......... 30
5.1.2 Add the dependencies for ML Kit Object Detection and Tracking .......... 30
5.2 Add on-device object detection .......... 31
5.3 Understand the visualization utilities .......... 32
6. Go further with product image search .......... 34
6.1 Call Vision API Product Search backend on Android .......... 34
6.2 About Vision API Product Search .......... 34
6.3 Handle object selection .......... 35
6.4 Implement the API client .......... 39
6.5 Implement the API client class .......... 39
6.6 Explore the API request and response format .......... 40
6.7 Get the product reference images .......... 40
7. Go further with image classification .......... 43
7.1 Install and import dependencies .......... 43
7.2 Download and Prepare your Data .......... 43
7.3 Create the Image Classifier Model .......... 44
7.4 Export the Model .......... 44
7.5 Update your Code for the Custom Model .......... 46
CONCLUSION .......... 48

LIST OF FIGURES

Fig 3.1 Interface of the object detection app .......... 20
Fig 3.4 Visualized detection result .......... 23
Fig 3.5 Detection result of a photo .......... 24
Fig 4.1 Output of the detection result .......... 27
Fig 4.2 Accuracy of the predicted items .......... 29
Fig 5.1 Interface of the product image search app .......... 33
Fig 6.1 Interface of the product image search app .......... 36
Fig 6.2 Product images from the catalog .......... 38
Fig 6.3 Interface of the product image search app after connecting the two APIs .......... 42
Fig 7.1 Classification of an image .......... 45
Fig 7.2 Accuracy of the classification of an image .......... 47

1. INTRODUCTION

Today's world requires the understanding and handling of visual information in almost every field: self-driving technology, e-health, and e-commerce, among others. Computer vision, a branch of artificial intelligence, involves the capacity of computers to process visual information and has a number of applications that improve user and business processes.

In this project we focus on the application of TensorFlow to important computer vision problems, namely object detection, image classification and product search. TensorFlow is a broad open-source library created by Google that allows the easy integration of advanced solutions into existing architectures.

Self-driving cars and pattern recognition technologies require object detection, whereas image classification is useful in enhancing medical imaging as well as content security. Besides that, visual product search has also gained importance, especially in online shopping, where images rather than written text are used to find products.

Within the scope of this project, we consider the methods, algorithms, datasets, and other relevant components related to the aforementioned tasks. We demonstrate the application of TensorFlow in building effective and precise vision systems that are implemented in practice.

This internship program is designed to equip participants with foundational and advanced knowledge in AI and ML, focusing on real-world applications using Google's industry-leading tools and platforms. Interns engage in hands-on projects covering key topics such as supervised and unsupervised learning, neural networks, natural language processing, and cloud-based AI solutions. Through a comprehensive curriculum, participants not only gain technical expertise but also develop the problem-solving and analytical skills necessary to build and deploy AI models effectively.

The Google AI/ML Virtual Internship aims to nurture a new generation of AI/ML professionals by providing in-depth insights into the industry and fostering a deep understanding of AI's transformative potential.
2. Program neural networks with TensorFlow

2.1 Introduction to Computer Vision:

This field focuses on teaching machines to understand and interpret visual data (images, videos). It allows computers to mimic human vision tasks like recognizing objects, detecting motion, and understanding scenes. Applications include facial recognition, self-driving cars, medical imaging, and more.

2.2 Introduction to Convolutions:

Convolution is a mathematical operation essential in image processing. It involves applying filters to images, which helps in detecting patterns such as edges, textures, and corners. These filters are small matrices that move across the image, capturing essential features.
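To make the sliding-window idea concrete, here is a small self-contained Kotlin sketch, not tied to any library: it applies a 3x3 vertical-edge kernel to a grayscale image represented as a 2D array. It is an illustration of the mechanism, not production image-processing code.

import kotlin.math.abs

// Slide a square, odd-sized kernel over a grayscale image (2D array of pixel
// intensities) and compute a weighted sum at each valid position.
fun convolve(image: Array<DoubleArray>, kernel: Array<DoubleArray>): Array<DoubleArray> {
    val h = image.size
    val w = image[0].size
    val k = kernel.size
    val out = Array(h - k + 1) { DoubleArray(w - k + 1) }
    for (y in out.indices) {
        for (x in out[0].indices) {
            var sum = 0.0
            for (i in 0 until k) {
                for (j in 0 until k) {
                    sum += image[y + i][x + j] * kernel[i][j]
                }
            }
            out[y][x] = sum  // strong response where the local pattern matches the filter
        }
    }
    return out
}

// A classic vertical-edge detector: responds where intensity changes left-to-right.
val verticalEdgeKernel = arrayOf(
    doubleArrayOf(-1.0, 0.0, 1.0),
    doubleArrayOf(-1.0, 0.0, 1.0),
    doubleArrayOf(-1.0, 0.0, 1.0)
)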

2.3 Convolutional Neural Networks (CNNs):

CNNs are deep learning models designed specifically for processing and analysing visual data. CNNs are composed of layers:

● Convolutional layers extract features from input images by applying filters (kernels).

● Pooling layers reduce the dimensionality of the image data, retaining the most important information.

● Fully connected layers connect the final features to the output, making predictions like classification.

CNNs are widely used in image recognition tasks because they efficiently capture hierarchical patterns in images, starting with simple edges and progressing to complex structures.

2.4 Using CNNs with Larger Datasets:

When working with large datasets, CNNs show exceptional performance but require a lot of computational power. Key techniques for managing large datasets include:

● Data augmentation: modifying images (e.g., rotating, flipping, or scaling) to artificially expand the dataset without collecting more data.

● Transfer learning: using pre-trained CNNs and fine-tuning them for specific tasks. This saves time and computational resources while leveraging the knowledge from previously trained models.

By employing CNNs and these techniques, developers can handle complex, large-scale visual data, building systems that achieve high accuracy in image classification, object detection, and more.
3. Get started with object detection

ML Kit is a mobile SDK that brings Google's on-device machine learning expertise to Android and iOS apps. You can use the powerful yet simple-to-use Vision and Natural Language APIs to solve common challenges in your apps or create brand-new user experiences. All are powered by Google's best-in-class ML models and offered to you at no cost.

ML Kit's APIs all run on-device, allowing for real-time use cases where you want to process a live camera stream, for example. This also means that the functionality is available offline. This codelab will walk you through simple steps to add Object Detection and Tracking (ODT) for a given image into your existing Android app. Please note that this codelab takes some shortcuts to highlight ML Kit ODT usage.

3.1 Add ML Kit Object Detection and Tracking API to the project

First, there is a Button at the bottom of the screen to:

● bring up the camera app integrated in your device/emulator

● take a photo inside the camera app

● receive the captured image back in the starter app

Try out the Take photo button: follow the prompts to take a photo, accept the photo, and observe it displayed inside the starter app.

Fig 3.1 Interface of the object detection app; Fig 3.2 Capturing a photo in the camera app; Fig 3.3 Captured image in the starter app
3.2 Add on-device object detection

In this step, you will add the functionality to the starter app to detect objects in images. As you saw in the previous step, the starter app contains boilerplate code to take photos with the camera app on the device. There are also 3 preset images in the app that you can try object detection on if you are running the codelab on an Android emulator.

When you have selected an image, either from the preset images or by taking a photo with the camera app, the boilerplate code decodes that image into a Bitmap instance, shows it on the screen and calls the runObjectDetection method with the image.

3.3 Set up and run on-device object detection on an image

There are only 3 simple steps with 3 APIs to set up ML Kit ODT:

● prepare an image: InputImage

● create a detector object: ObjectDetection.getClient(options)

● connect the 2 objects above: process(image)

Step 1: Create an InputImage

Step 2: Create a detector instance

ML Kit follows the Builder design pattern: you pass the configuration to the builder, then acquire a detector from it. There are 3 options to configure (the choices used in this codelab appear in the sketch below):

● detector mode (single image or stream)

● detection mode (single or multiple object detection)

● classification mode (on or off)

Step 3: Feed image(s) to the detector

Object detection and classification is asynchronous processing:

● You send an image to the detector (via process()).

● The detector reports the result back to you via a callback.
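Put together, the three steps look roughly like the following Kotlin sketch. The ML Kit calls (InputImage.fromBitmap, ObjectDetection.getClient, process) are the library's standard API; the specific option choices and the log tag are this sketch's assumptions.

import android.graphics.Bitmap
import android.util.Log
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.objects.ObjectDetection
import com.google.mlkit.vision.objects.defaults.ObjectDetectorOptions

fun runObjectDetection(bitmap: Bitmap) {
    // Step 1: wrap the Bitmap in an InputImage (rotation 0 for an upright image).
    val image = InputImage.fromBitmap(bitmap, 0)

    // Step 2: configure and create the detector:
    // single-image mode, multiple objects, classification on.
    val options = ObjectDetectorOptions.Builder()
        .setDetectorMode(ObjectDetectorOptions.SINGLE_IMAGE_MODE)
        .enableMultipleObjects()
        .enableClassification()
        .build()
    val detector = ObjectDetection.getClient(options)

    // Step 3: feed the image to the detector; the result arrives asynchronously.
    detector.process(image)
        .addOnSuccessListener { detectedObjects ->
            for (obj in detectedObjects) {
                val label = obj.labels.firstOrNull()?.text ?: "Unknown"
                Log.d("MLKit-ODT", "Detected $label at ${obj.boundingBox}")
            }
        }
        .addOnFailureListener { e -> Log.e("MLKit-ODT", "Detection failed", e) }
}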

3.4 Post-processing the detection results

In this section, you'll draw the detection result onto the image:

● draw the bounding box on the image

● draw the category name and confidence inside the bounding box

3.5 Understand the visualization utilities

● fun drawDetectionResult(bitmap: Bitmap, detectionResults: List<BoxWithText>): Bitmap. This method draws the object detection results in detectionResults on the input bitmap and returns a modified copy of it.
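The codelab ships this utility for you; purely as an illustration of what such a helper might do internally, here is a hand-rolled Kotlin sketch (the BoxWithText fields and the paint styling are assumptions):

import android.graphics.*

data class BoxWithText(val box: Rect, val text: String)

// Draw each detection's box and label onto a mutable copy of the bitmap.
fun drawDetectionResult(bitmap: Bitmap, detectionResults: List<BoxWithText>): Bitmap {
    val output = bitmap.copy(Bitmap.Config.ARGB_8888, true)
    val canvas = Canvas(output)
    val boxPaint = Paint().apply {
        style = Paint.Style.STROKE
        strokeWidth = 8f
        color = Color.RED
    }
    val textPaint = Paint().apply {
        textSize = 48f
        color = Color.RED
    }
    for (result in detectionResults) {
        canvas.drawRect(result.box, boxPaint)
        canvas.drawText(result.text, result.box.left.toFloat(),
            result.box.top.toFloat() - 8f, textPaint)
    }
    return output
}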

Here is an example of an output of the drawDetectionResult utility method:

Fig 3.4 Visualized detection result

3.6 Visualize the ML Kit detection result

Use the visualization utilities to draw the ML Kit object detection result on top of the input image. Once the app loads, press the button with the camera icon, point your camera at an object, take a photo, accept the photo (in the camera app), or simply tap any of the preset images. You should see the detection results; press the button again or select another image and repeat a couple of times to experience the latest ML Kit ODT!

Fig 3.5 Detection result of a photo

4. Go further with object detection

In this unit, you'll learn how to train a custom object detection model using a set of training images with TFLite Model Maker, then deploy your model to an Android app using the TFLite Task Library. You will:

● Build an Android app that detects ingredients in images of meals.

● Integrate a TFLite pre-trained object detection model and see the limit of what the model can detect.

● Train a custom object detection model to detect the ingredients/components of a meal using a custom dataset called salad and TFLite Model Maker.

● Deploy the custom model to the Android app using the TFLite Task Library.

4.1 Object Detection

Object detection is a set of computer vision tasks that can detect and locate objects in a digital image. Given an image or a video stream, an object detection model can identify which of a known set of objects might be present, and provide information about their positions within the image.

TensorFlow provides pre-trained, mobile-optimized models that can detect common objects, such as cars, oranges, etc. You can integrate these pre-trained models in your mobile app with just a few lines of code. However, you may want or need to detect objects in more distinctive or offbeat categories. That requires collecting your own training images, then training and deploying your own object detection model.

4.2 TensorFlow Lite

TensorFlow Lite is a cross-platform machine learning library that is optimized for running machine learning models on edge devices, including Android and iOS mobile devices. TensorFlow Lite is actually the core engine used inside ML Kit to run machine learning models. There are two components in the TensorFlow Lite ecosystem that make it easy to train and deploy machine learning models on mobile devices:

● Model Maker is a Python library that makes it easy to train TensorFlow Lite models using your own data with just a few lines of code, no machine learning expertise required.

● Task Library is a cross-platform library that makes it easy to deploy TensorFlow Lite models with just a few lines of code in your mobile apps.

● fun drawDetectionResult(bitmap: Bitmap, detectionResults: List<DetectionResult>): Bitmap. This method draws the object detection results in detectionResults on the input bitmap.

Here is an example of an output of the drawDetectionResult utility method.

Fig 4.1 Output of the detection result

The TFLite Task Library makes it easy to integrate mobile-optimized machine learning models into a mobile app. It supports many popular machine learning use cases, including object detection, image classification, and text classification. You can load the TFLite model and run it with just a few lines of code, as in the sketch below.
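As an illustration, here is a minimal Kotlin sketch of running a bundled detection model with the Task Library. The ObjectDetector calls are the library's documented API; the model file name and the thresholds are assumptions for this sketch.

import android.content.Context
import android.graphics.Bitmap
import android.util.Log
import org.tensorflow.lite.support.image.TensorImage
import org.tensorflow.lite.task.vision.detector.ObjectDetector

fun detectWithTaskLibrary(context: Context, bitmap: Bitmap) {
    // Configure the detector: keep the top 5 results above a 0.3 score.
    val options = ObjectDetector.ObjectDetectorOptions.builder()
        .setMaxResults(5)
        .setScoreThreshold(0.3f)
        .build()

    // Load a .tflite model bundled in the app's assets folder (name is an assumption).
    val detector = ObjectDetector.createFromFileAndOptions(
        context, "salad.tflite", options)

    // Run detection on the bitmap and log each result.
    val results = detector.detect(TensorImage.fromBitmap(bitmap))
    for (detection in results) {
        val category = detection.categories.first()
        Log.d("TFLite", "${category.label} (${category.score}) at ${detection.boundingBox}")
    }
}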

The starter app is a minimal Android application that:

- uses either the device camera or available preset images;

- contains methods for taking pictures and presenting object detection output.

You will add functionality for object detection within the application by filling out the method runObjectDetection(). The function is defined as follows:

runObjectDetection(bitmap: Bitmap): a function that conducts object detection on an input image using the object detection algorithm.

Add a pre-trained object detection model:

● Download the model. The pre-trained TFLite model is EfficientDet-Lite. This model is designed to be mobile-efficient, and it is trained on the COCO 2017 dataset.

● Add the dependencies.

● Configure and perform object detection.

● Render the detection results.

Train a custom object detection model:

● You will train a custom model to detect meal ingredients using TFLite Model Maker and Google Colab. The dataset is composed of labelled images of ingredients such as cheese and baked goods.

Fig 4.2 Accuracy of the predicted items

You have developed an Android application that can detect objects in images, first with a TFLite pre-trained model, and then by training and deploying a custom object detection model. You used TFLite Model Maker for model training and the TFLite Task Library for its integration into the application.

5. Get started with product image search

5.1 Detect objects in images to build a visual product search with ML Kit: Android

Have you seen the Google Lens demo, where you can point your phone camera at an object and find where you can buy it online? If you want to learn how you can add the same feature to your app, then this codelab is for you. It is part of a learning pathway that teaches you how to build a product image search feature into a mobile app.

In this codelab, you will learn the first step to build a product image search feature: how to detect objects in images and let the user choose the objects they want to search for. You will use ML Kit Object Detection and Tracking to build this feature.

5.1.1 Import the app into Android Studio

Start by importing the starter app into Android Studio: select Import Project (Gradle, Eclipse ADT, etc.) and choose the starter folder from the source code that you downloaded earlier.

5.1.2 Add the dependencies for ML Kit Object Detection and Tracking

The ML Kit dependencies allow you to integrate the ML Kit ODT SDK in your app. Go to the app/build.gradle file of your project and confirm that the dependency is already there; a representative example of the relevant line is shown below.
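For reference, the dependency looks roughly like this in the Gradle Kotlin DSL (the artifact coordinate is ML Kit's standard one; the version number is an assumption, so use whatever the starter project pins):

// app/build.gradle.kts (the Groovy build.gradle equivalent is one line as well)
dependencies {
    // ML Kit Object Detection and Tracking
    implementation("com.google.mlkit:object-detection:17.0.0")
}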

5.2 Add on-device object detection

In this step, you'll add the functionality to the starter app to detect objects in images. As you saw in the previous step, the starter app contains boilerplate code to take photos with the camera app on the device. There are also 3 preset images in the app that you can try object detection on if you are running the codelab on an Android emulator.

When you select an image, either from the preset images or by taking a photo with the camera app, the boilerplate code decodes that image into a Bitmap instance, shows it on the screen and calls the runObjectDetection method with the image.

In this step, you will add code to the runObjectDetection method to do object detection!

Step 1: Create an InputImage

Step 2: Create a detector instance

Step 3: Feed image(s) to the detector

Object detection and classification is asynchronous processing:

● You send an image to the detector (via process()).

● The detector reports the result back to you via a callback.

Upon completion, the detector notifies you with:

1. the total number of objects detected;

2. each detected object, described with:

● trackingId: an integer you use to track the object across frames (NOT used in this codelab)

● boundingBox: the object's bounding box

● labels: a list of label(s) for the detected object (only when classification is enabled)

● text: the text of the label, one of "Fashion Goods", "Food", "Home Goods", "Place", "Plant"

5.3 Understand the visualization utilities

There is some boilerplate code inside the codelab to help you visualize the detection result. Leverage these utilities to keep the visualization code simple:

● fun drawDetectionResults(results: List<DetectedObject>): This method draws white circles at the center of each detected object.

● fun setOnObjectClickListener(listener: ((objectImage: Bitmap) -> Unit)): This is a callback to receive the cropped image that contains only the object that the user has tapped on. You will send this cropped image to the image search backend in a later codelab to get a visually similar result. In this codelab, you won't use this method yet.

Fig 5.1 Interface of the product image search app

6. Go further with product image search

6.1 Call Vision API Product Search backend on Android

Have you seen the Google Lens demo, where you can point your phone camera at an object and find where you can buy it online? If you want to learn how you can add the same feature to your app, then this codelab is for you. It is part of a learning pathway that teaches you how to build a product image search feature into a mobile app.

In this codelab, you will learn how to call a backend built with Vision API Product Search from a mobile app. This backend can take a query image and search for visually similar products from a product catalog.

6.2 About Vision API Product Search

Vision API Product Search is a feature in Google Cloud that allows users to search for visually similar products from a product catalog. Retailers can create products, each containing reference images that visually describe the product from a set of viewpoints. You can then add these products to product sets (i.e., a product catalog). Currently, Vision API Product Search supports the following product categories: home goods, apparel, toys, packaged goods, and general.

When users query the product set with their own images, Vision API Product Search applies machine learning to compare the product in the user's query image with the images in the retailer's product set, and then returns a ranked list of visually and semantically similar results.

6.3 Handle object selection

6.3.1 Allow users to tap on a detected object to select

Now you'll add code to allow users to select an object from the image and start the product search. The starter app already has the capability to detect objects in the image. It's possible that there are multiple objects in the image, or that the detected object only occupies a small portion of the image. Therefore, you need to have the user tap on one of the detected objects to indicate which object they want to use for product search.

The view that displays the image in the main activity (ObjectDetectorActivity) is actually a custom view (ImageClickableView) that extends Android OS's default ImageView. It implements some convenient utility methods, including:

● fun setOnObjectClickListener(listener: ((objectImage: Bitmap) -> Unit)): This is a callback to receive the cropped image that contains only the object that the user has tapped on. You will send this cropped image to the product search backend.

Fig 6.1 Interface of the product image search app

The onObjectClickListener is called whenever the user taps on any of the detected objects on the screen. It receives the cropped image that contains only the selected object. The code snippet does 3 things:

● Takes the cropped image and serializes it to a PNG file.

● Starts the ProductSearchActivity to execute the product search sequence.

● Includes the cropped image URI in the start-activity intent so that ProductSearchActivity can retrieve it later to use as the query image.
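A hedged sketch of that listener wiring follows; the view reference (ivPreview), the file name, and the intent extra key are assumptions of this sketch, not names mandated by the codelab.

import android.content.Intent
import android.graphics.Bitmap
import android.net.Uri
import java.io.File
import java.io.FileOutputStream

// Inside ObjectDetectorActivity, e.g. in onCreate():
ivPreview.setOnObjectClickListener { objectImage: Bitmap ->
    // 1. Serialize the cropped object image to a PNG file (lossless).
    val file = File(cacheDir, "query_image.png")
    FileOutputStream(file).use { out ->
        objectImage.compress(Bitmap.CompressFormat.PNG, 100, out)
    }
    // 2. Start ProductSearchActivity to run the product search sequence,
    // 3. passing the image URI so it can be retrieved as the query image.
    startActivity(Intent(this, ProductSearchActivity::class.java).apply {
        putExtra("query_image_uri", Uri.fromFile(file).toString())
    })
}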

There are a few things to keep in mind:

● The logic for detecting objects and querying the backend has been split into 2 activities only to make the codelab easier to understand. It's up to you to decide how to implement them in your app.

● You need to write the query image into a file and pass the image URI between activities because the query image can be larger than the 1 MB size limit of an Android intent.

● You can store the query image as PNG because it's a lossless format.

6.3.2 Explore the product search backend

Build the product image search backend: this codelab requires a product search backend built with Vision API Product Search. There are two options to achieve this:

Option 1: Use the demo backend that has been deployed for you.

Option 2: Create your own backend by following the Vision API Product Search quickstart.

You will come across these concepts when interacting with the product search backend:

● Product set: a simple container for a group of products. A product catalog can be represented as a product set and its products.

● Product: after you have created a product set, you can create products and add them to the product set.

● Product's reference images: images containing various views of your products. Reference images are used to search for visually similar products.

● Search for products: once you have created your product set and the product set has been indexed, you can query the product set using the Cloud Vision API.

6.3.3 Understand the preset product catalog

The product search demo backend used in this codelab was created using the Vision API Product Search and a product catalog of about a hundred shoe and dress images. Here are some images from the catalog:

Fig 6.2 Product images from the catalog

6.3.4 Call the product search demo backend

You can call the Vision API Product Search directly from a mobile app by setting up a Google Cloud API key and restricting access to the API key to just your app.

To keep this codelab simple, a proxy endpoint has been set up that allows you to access the demo backend without worrying about the API key and authentication. It receives the HTTP request from the mobile app, appends the API key, and forwards the request to the Vision API Product Search backend. Then the proxy receives the response from the backend and returns it to the mobile app.

6.4 Implement the API client

6.4.1 Understand the product search workflow

Follow this workflow to conduct a product search with the backend:

● Encode the query image as a base64 string.

● Call the projects.locations.images.annotate endpoint with the query image.

● Receive the product image IDs from the previous API call and send them to the projects.locations.products.referenceImages.get endpoint to get the URIs of the product images in the search result.

6.5 Implement the API client class

Now you'll implement code to call the product search backend in a dedicated class called ProductSearchAPIClient. Some boilerplate code has been implemented for you in the starter app:

● class ProductSearchAPIClient: This class is mostly empty now, but it has some methods that you will implement later in this codelab.

● fun convertBitmapToBase64(bitmap: Bitmap): Converts a Bitmap instance into its base64 representation to send to the product search backend (a sketch appears after this list).

● fun annotateImage(image: Bitmap): Task<List<ProductSearchResult>>: Calls the projects.locations.images.annotate API and parses the response.

● fun fetchReferenceImage(searchResult: ProductSearchResult): Task<ProductSearchResult>: Calls the projects.locations.products.referenceImages.get API and parses the response.

● SearchResult.kt: This file contains several data classes to represent the types returned by the Vision API Product Search backend.
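As an example, convertBitmapToBase64 can be implemented with android.util.Base64 in a few lines; the JPEG quality setting here is an arbitrary choice for the sketch:

import android.graphics.Bitmap
import android.util.Base64
import java.io.ByteArrayOutputStream

// Compress the bitmap to JPEG bytes, then Base64-encode them for the request body.
fun convertBitmapToBase64(bitmap: Bitmap): String {
    val stream = ByteArrayOutputStream()
    bitmap.compress(Bitmap.CompressFormat.JPEG, 90, stream)
    return Base64.encodeToString(stream.toByteArray(), Base64.NO_WRAP)
}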

6.6 Explore the API request and response format

You can find similar products to a given image by passing the image's Google Cloud Storage URI, web URL, or base64-encoded string to Vision API Product Search. Here are some important fields in the product search result object:

● product.name: the unique identifier of a product, in the format projects/{project-id}/locations/{location-id}/products/{product_id}

● product.score: a value indicating how similar the search result is to the query image. Higher values mean more similarity.

● product.image: the unique identifier of the reference image of a product, in the format projects/{project-id}/locations/{location-id}/products/{product_id}/referenceImages/{image_id}. You will need to send another API request to projects.locations.products.referenceImages.get to get the URL of this reference image so that it can be displayed on the screen.

6.7 Get the product reference images

Explore the API request and response format

You'll send a GET HTTP request with an empty request body to the projects.locations.products.referenceImages.get endpoint to get the URIs of the product images returned by the product search endpoint.

The reference images of the demo product search backend were set up with public-read permission. Therefore, you can easily convert the GCS URI to an HTTP URL and display it in the app UI. You only need to replace the gs:// prefix with https://storage.googleapis.com/.
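That substitution is a one-line helper:

// Convert a public-read GCS URI (gs://bucket/object) to a browsable HTTP URL.
fun gcsUriToHttpUrl(gcsUri: String): String =
    gcsUri.replaceFirst("gs://", "https://storage.googleapis.com/")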

6.8 Implement the API call

Next, craft a product search API request and send it to the backend. You'll use Volley and the Task API, similarly to the product search API call, as in the sketch below.
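Here is a hedged sketch of fetchReferenceImage along those lines. The Volley StringRequest and the Tasks API's TaskCompletionSource are real APIs; the endpoint constants, the "uri" response field, and the ProductSearchResult fields are assumptions based on the workflow described above.

import com.android.volley.Request
import com.android.volley.toolbox.StringRequest
import com.google.android.gms.tasks.Task
import com.google.android.gms.tasks.TaskCompletionSource
import org.json.JSONObject

fun fetchReferenceImage(searchResult: ProductSearchResult): Task<ProductSearchResult> {
    // Bridge Volley's callback style into a Task the caller can chain on.
    val source = TaskCompletionSource<ProductSearchResult>()
    val url = "$VISION_API_URL/${searchResult.imageId}?key=$VISION_API_KEY"
    val request = StringRequest(Request.Method.GET, url,
        { response ->
            // The response body contains the reference image's GCS URI.
            val gcsUri = JSONObject(response).getString("uri")
            searchResult.imageUri = gcsUriToHttpUrl(gcsUri)
            source.setResult(searchResult)
        },
        { error -> source.setException(error) })
    requestQueue.add(request)
    return source.task
}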

6.9 Connect the two API requests

Go back to annotateImage and modify it to get all the reference images' HTTP URLs before returning the ProductSearchResult list to its caller.

Once the app loads, tap any preset image, select a detected object, and tap the Search button to see the search results, this time with the product images.

Fig 6.3 Interface of the product image search app after connecting the two APIs

7. Go further with image classification

In the previous codelab you created an app for Android and iOS that used a basic image labelling model that recognizes several hundred classes of image. It recognized a picture of a flower very generically, seeing petals, flower, plant, and sky.

To update the app to recognize specific flowers, daisies or roses for example, you'll need a custom model that's trained on lots of examples of each of the types of flower you want to recognize.

This codelab will not go into the specifics of how a model is built. Instead, you'll learn about the APIs from TensorFlow Lite Model Maker that make it easy.

7.1 Install and import dependencies

Install TensorFlow Lite Model Maker. You can do this with a pip install. The &> /dev/null at the end of the command just suppresses the output. Model Maker prints a lot of information that isn't immediately relevant, so it's been suppressed to let you focus on the task at hand.

7.2 Download and Prepare your Data

If your images are organized into folders, and those folders are zipped up, then when you download the zip and decompress it, you'll automatically get your images labeled based on the folder they're in. This directory will be referenced as data_path.

This data path can then be loaded into a neural network model for training with TensorFlow Lite Model Maker's ImageClassifierDataLoader class. Just point it at the folder and you're good to go. One important element in training models with machine learning is to not use all of your data for training. Hold back a little to test the model with data it hasn't previously seen. This is easy to do with the split method of the dataset that comes back from ImageClassifierDataLoader.

7.3 Create the Image Classifier Model

Model Maker abstracts a lot of the specifics of designing the neural network, so you don't have to deal with network design and things like convolutions, dense layers, ReLU, flatten, loss functions, and optimizers.

The model went through 5 epochs, where an epoch is a full cycle of training in which the neural network tries to match the images to their labels. By the time it went through 5 epochs, in around 1 minute, it was 93.85% accurate on the training data. Given that there are 5 classes, a random guess would be 20% accurate, so that's progress!

7.4 Export the Model

Now that the model is trained, the next step is to export it in the .tflite format that a mobile application can use. Model Maker provides an easy export method; simply specify the directory to output to.

Fig 7.1 Classification of an image

For the rest of this lab, I'll be running the app in the iPhone simulator, which should support the build targets from the codelab. If you want to use your own device, you might need to change the build target in your project settings to match your iOS version.

Run it and you'll see something like this. Note the very generic classifications: petal, flower, sky. The model you created in the previous codelab was trained to detect 5 varieties of flower, including this one, a daisy.

For the rest of this codelab, you'll look at what it will take to upgrade your app with the custom model.

7.5 Update your Code for the Custom Model

1. Open your ViewController.swift file. You may see an error on the 'import MLKitImageLabeling' line at the top of the file. This is because you removed the generic image labelling libraries when you updated your Podfile. Replace the imports with:

import MLKitVision
import MLKit
import MLKitImageLabelingCommon
import MLKitImageLabelingCustom

It might be easy to speed-read these and think they repeat the same code, but note the "Common" and "Custom" at the end!

2. Next you'll load the custom model that you added in the previous step. Find the getLabels() func. Beneath the line that reads visionImage.orientation = image.imageOrientation, add the lines from the codelab that load the local model.

3. Find the code for specifying the options for the generic ImageLabeler. It's probably giving you an error since those libraries were removed: let options = ImageLabelerOptions(). Replace that with this code, which uses CustomImageLabelerOptions and specifies the local model: let options = CustomImageLabelerOptions(localModel: localModel)

...and that's it! Try running your app now! When you try to classify the image, it should be more accurate, and tell you that you're looking at a daisy with high probability!

Fig 7.2 Accuracy of the classification of an image

CONCLUSION

In conclusion, I am proud to have successfully completed the virtual internship as a Google AI/ML Intern, which has been an incredibly enriching experience in my professional development. Throughout this internship, I engaged with a series of comprehensive courses and hands-on projects that provided me with a strong foundation in artificial intelligence and machine learning.

Starting with the AI Foundations course, I gained essential knowledge about the fundamental concepts of AI and ML. This course covered critical topics such as supervised and unsupervised learning, neural networks, and deep learning algorithms. Understanding these key concepts has been instrumental in shaping my perspective on the growing impact of AI and its applications in various industries.

Building on this foundation, I progressed to the Applied Machine Learning course, which offered deeper insights into deploying machine learning models in real-world scenarios. This course provided practical exposure to training models, fine-tuning hyperparameters, and evaluating model performance using Google's AI tools and frameworks. The hands-on experience from this course equipped me with the ability to build and optimize models to solve real-world problems effectively.

Finally, I completed a capstone project focused on using Google Cloud AI and ML tools, where I implemented a machine learning solution to a real business problem. This project allowed me to apply everything I learned, from data preprocessing and model building to deployment. This practical experience has prepared me for real-world challenges where I can apply AI and ML to drive meaningful results.

Overall, this virtual internship has not only expanded my technical knowledge but also solidified my passion for pursuing a career in artificial intelligence and machine learning. The combination of theoretical learning and practical application through these courses has significantly enriched my understanding of AI and ML. I am excited to leverage this knowledge as I continue to explore the field and contribute to innovative solutions in the AI/ML domain.
