GOOGLE AI-ML VIRTUAL INTERNSHIP
BACHELOR OF TECHNOLOGY IN
COMPUTER SCIENCE AND ENGINEERING
By
B. MARY JONES (20131A0524)
CERTIFICATE
This report on
“GOOGLE AI-ML VIRTUAL INTERNSHIP”
is a bonafide record of the internship work submitted
by
B. MARY JONES (20131A0524)
in their VII semester, in partial fulfilment of the requirements for the award of the degree of
Bachelor of Technology in
Computer Science and Engineering
Internship Mentor
Ms. P. SRAVYA
(Assistant Professor)
ACKNOWLEDGMENT
We would like to express our deep sense of gratitude to our esteemed institute
Gayatri Vidya Parishad College of Engineering (Autonomous), which has provided us an
opportunity to fulfil our cherished desire.
We thank our Course Coordinator, Dr. CH. SITA KUMARI, Associate Professor,
Department of Computer Science and Engineering, for the kind suggestions and guidance
for the successful completion of our internship.
We are very thankful and highly indebted to Dr. D. UMA DEVI, Associate
Professor & I/C Head of the Department of Computer Science and Engineering, Gayatri
Vidya Parishad College of Engineering (Autonomous), for giving us an opportunity to do
the internship in college.
We express our sincere thanks to our Principal, Dr. A.B. KOTESWARA RAO,
Gayatri Vidya Parishad College of Engineering (Autonomous), for his encouragement
during this project and for giving us a chance to explore and learn new technologies in the
form of an internship.
We are very thankful to AICTE, EduSkills, and Google for giving us this internship
and for helping to resolve every issue regarding it.
Finally, we are indebted to the teaching and non-teaching staff of the Computer
Science and Engineering Department for all their support in the completion of our project.
B. MARY JONES
(20131A0524)
CERTIFICATE: 20131A0524
ABSTRACT
This internship report presents an in-depth exploration of my journey with Google AI ML,
where I delved into the practical application of machine learning through TensorFlow.
TensorFlow, a widely used open-source library, served as the foundation for building and
deploying various machine learning models. The program encompassed a range of modules,
from programming neural networks to object detection and image-based tasks like search and
classification. Throughout these modules, I gained hands-on experience in designing, training,
and evaluating machine learning models for real-world scenarios. This included identifying
objects within images and retrieving product images based on user queries.
The report delves into the core concepts explored in each module, highlighting the
technical skills I acquired during the internship. Understanding these concepts and mastering
the tools provided a strong foundation for building and manipulating machine learning models.
The report also explores any projects undertaken and the challenges encountered along the
way. Finally, the report concludes by summarizing the invaluable experience gained during
this internship. It emphasizes the key takeaways, particularly the acquired technical skills, and
discusses how this experience can be leveraged for future endeavors in the field of Artificial
Intelligence and Machine Learning.
The knowledge and skills acquired during this internship at Google AI ML hold significant
promise for the future. The ability to design and implement machine learning models using
TensorFlow opens doors to various applications across diverse industries. This experience
equips me to contribute to advancements in areas like computer vision, image retrieval
systems, and potentially even develop innovative solutions for problems yet to be encountered.
The internship not only fostered technical expertise but also nurtured a deeper understanding
of the potential and challenges within the ever-evolving landscape of Artificial Intelligence.
Table of Contents
Introduction
Chapter 1: Program Neural Networks with TensorFlow
    1.1 The Hello World of Machine Learning
    1.3 Introduction to Convolutions and CNN
Chapter 2: Get Started with Object Detection
Chapter 3: Go Further with Object Detection
Chapter 4: Get Started with Product Image Search
Chapter 5: Go Further with Product Image Search
Chapter 6: Go Further with Image Classification
Case Study
Conclusion
References
INTRODUCTION
Artificial Intelligence:
Artificial Intelligence (AI) is a branch of computer science that aims to create
systems capable of performing tasks that typically require human intelligence. These
tasks include reasoning, learning, problem-solving, perception, and language
understanding.
Machine Learning:
Machine Learning (ML) is a subset of AI that focuses on developing
algorithms and statistical models that enable computers to learn from and make
predictions or decisions based on data. ML algorithms can be categorized into three
main types:
1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning.
Deep Learning:
Deep Learning (DL) is a subfield of ML that uses neural networks with
multiple layers (hence the term "deep") to learn representations of data. DL
algorithms have shown remarkable success in various tasks such as image and
speech recognition, natural language processing, and autonomous driving.
TensorFlow:
TensorFlow is an open-source machine-learning framework developed by
Google Brain for building and training machine-learning models. It is widely used
in AI research and production applications due to its scalability, flexibility, and
extensive ecosystem of tools and libraries. Key features of TensorFlow include:
1. Graph-based computation
2. Automatic differentiation
3. High-level APIs
4. Scalability
5. Extensive library ecosystem
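Of these features, automatic differentiation is what makes gradient-based training possible: TensorFlow records the operations applied to tensors and derives gradients from them. As a concept illustration only (this is not TensorFlow's actual implementation), forward-mode automatic differentiation can be sketched in plain Python using dual numbers:

```python
# Minimal forward-mode automatic differentiation with dual numbers.
# Each Dual carries a value and the derivative of that value with
# respect to the chosen input variable.
class Dual:
    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

def f(x):
    return 3 * x * x + 2 * x + 1  # f'(x) = 6x + 2

x = Dual(4.0, 1.0)  # seed derivative dx/dx = 1
y = f(x)
print(y.value, y.deriv)  # f(4) = 57, f'(4) = 26
```

TensorFlow's tf.GradientTape offers the same capability (in reverse mode, which scales better to models with many parameters), so users never write derivative code by hand.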
Compared with classical ML, DL also differs in its data appetite and feature handling: ML
typically relies on manual feature engineering and can give good results on both large and
small data, whereas DL learns features automatically and gives the best results on large data.
CHAPTER-1: PROGRAM NEURAL NETWORKS WITH
TENSORFLOW
1.1 The Hello World of Machine Learning
Whereas ML, rather than attempting to define and express rules explicitly in a
programming language, we furnish answers, referred to as labels, alongside the data.
The machine then deduces the rules governing the relationship between the data and
the provided answers. Labels, in our context, serve as information indicating the
ongoing activity.
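To make this concrete, consider the classic "hello world" example from this module: learning the rule y = 2x - 1 from six labeled examples. The codelab builds this with a single Keras Dense neuron; the sketch below reproduces the same idea with plain-Python gradient descent so the mechanics stay visible (the learning rate and epoch count are illustrative choices):

```python
# "Hello World" of ML: learn the rule y = 2x - 1 from labeled examples
# instead of coding the rule by hand.
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-3.0, -1.0, 1.0, 3.0, 5.0, 7.0]  # labels: y = 2x - 1

w, b = 0.0, 0.0  # model parameters, start untrained
lr = 0.05        # learning rate

for epoch in range(2000):            # epochs: repeated passes over the data
    dw = db = 0.0
    for x, y in zip(xs, ys):
        err = (w * x + b) - y        # prediction error on one example
        dw += 2 * err * x / len(xs)  # gradient of mean squared error w.r.t. w
        db += 2 * err / len(xs)      # gradient w.r.t. b
    w -= lr * dw                     # gradient-descent update
    b -= lr * db

print(round(w, 2), round(b, 2))      # converges close to 2 and -1
print(round(w * 10 + b, 1))          # prediction for x=10: close to 19
```

The machine never sees the rule, only data and labels, yet it recovers w close to 2 and b close to -1, exactly the behavior the Keras version exhibits.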
Fig 1.1.2 Traditional programming method (2)
Preparing the data involves dividing the entire dataset into two parts. One portion is
allocated for training, while the other is designated for testing. We undertake this division
to assess the model's performance on data that hasn't been exposed during training. To
achieve this, we can make use of the function train_test_split().
Subsequently, to define the layers of the neural network, we employ
keras.layers.Dense, which lets us initialize a layer with the preferred number of
outputs. Additionally, we can apply activation functions to the outputs to introduce
non-linearity. Optimizers and loss functions serve as the primary components of the
training process, which iterates a specified number of times; each pass over the data is
referred to as an epoch.
For model testing, we can utilize the evaluate function from the Keras library,
enabling us to verify the test results.
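A minimal sketch of what the split step does, mirroring the behavior of scikit-learn's train_test_split() (the function below is a hand-rolled stand-in for illustration, not the library implementation):

```python
import random

# What train_test_split() does under the hood: shuffle the dataset,
# then slice it into a training portion and a held-out test portion
# so the model is evaluated on data it never saw during training.
def train_test_split(data, labels, test_size=0.2, seed=42):
    indices = list(range(len(data)))
    random.Random(seed).shuffle(indices)      # reproducible shuffle
    n_test = int(len(data) * test_size)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return ([data[i] for i in train_idx], [data[i] for i in test_idx],
            [labels[i] for i in train_idx], [labels[i] for i in test_idx])

data = list(range(10))
labels = [x * 2 for x in data]       # toy labels paired with the data
X_train, X_test, y_train, y_test = train_test_split(data, labels)
print(len(X_train), len(X_test))     # 8 2
```

Shuffling before slicing matters: it keeps any ordering in the original dataset from leaking into the split, and using the same indices for data and labels keeps each example paired with its label.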
1.3 Introduction to Convolutions and CNN (Convolutional Neural
Networks)
Convolution serves as a fundamental operation in signal processing and image
analysis, frequently employed in deep learning for tasks such as image recognition
and feature extraction. This process entails applying a filter, also referred to as a
kernel, to an input image to generate an output feature map.
Convolution operations aid in extracting features like edges, textures, and
patterns from images through a process of sliding the filter across the input image
and performing element-wise multiplications, followed by summation.
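The sliding-and-summing process described above can be written out directly. The image, kernel values, and sizes below are illustrative choices; the kernel is a classic vertical-edge filter:

```python
# Sliding a filter over an image: at each position, multiply the
# filter with the overlapping pixels element-wise and sum the products.
def convolve2d(image, kernel):
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):          # slide the window vertically
        row = []
        for j in range(iw - kw + 1):      # slide the window horizontally
            acc = 0
            for a in range(kh):
                for b in range(kw):
                    acc += image[i + a][j + b] * kernel[a][b]
            row.append(acc)
        out.append(row)
    return out

# A vertical-edge filter responds where pixel values change left-to-right.
image = [[0, 0, 10, 10],
         [0, 0, 10, 10],
         [0, 0, 10, 10],
         [0, 0, 10, 10]]
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]
print(convolve2d(image, kernel))  # strong response along the edge
```

In a CNN, the kernel values are not hand-picked as they are here; they are learned during training, which is what lets convolutional layers discover useful features on their own.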
Key Components of CNN include:
1. Convolutional layers
2. Pooling layers
3. Fully connected layers
4. Activation functions
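Pooling layers, for example, downsample feature maps so later layers see a coarser, more translation-tolerant view. A plain-Python sketch of 2x2 max pooling (the feature-map values are made up for illustration):

```python
# 2x2 max pooling: downsample a feature map by keeping only the
# strongest activation in each non-overlapping 2x2 window.
def max_pool2x2(fmap):
    out = []
    for i in range(0, len(fmap) - 1, 2):
        row = []
        for j in range(0, len(fmap[0]) - 1, 2):
            row.append(max(fmap[i][j], fmap[i][j + 1],
                           fmap[i + 1][j], fmap[i + 1][j + 1]))
        out.append(row)
    return out

fmap = [[1, 3, 2, 0],
        [4, 2, 1, 5],
        [0, 1, 9, 2],
        [6, 7, 3, 8]]
print(max_pool2x2(fmap))  # [[4, 5], [7, 9]]
```

Halving each spatial dimension this way cuts the amount of data flowing to the next layer by a factor of four while retaining the strongest responses from the convolutional layer before it.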
CHAPTER-2: GET STARTED WITH OBJECT DETECTION
"Get Started with Object Detection" marks a pivotal chapter within the
machine learning landscape. It serves as a comprehensive initiation into object detection
using Google's ML Kit Object Detection API. With a focus on practical implementation,
learners are introduced to the foundational concepts of object detection, including
bounding boxes, object localization, and model evaluation metrics.
Object detection stands as a computer vision task that entails recognizing and
pinpointing objects of interest within an image or video frame. Diverging from
image classification, which predicts the presence of an object in an image, object
detection extends further by furnishing accurate bounding boxes around detected
objects, alongside their corresponding class labels.
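One of the standard evaluation metrics for those bounding boxes is Intersection over Union (IoU), which scores how closely a predicted box matches the ground truth. A minimal sketch (the box coordinates are illustrative):

```python
# Intersection over Union (IoU): the standard metric for judging how
# well a predicted bounding box matches a ground-truth box.
# Boxes are (left, top, right, bottom).
def iou(box_a, box_b):
    left = max(box_a[0], box_b[0])
    top = max(box_a[1], box_b[1])
    right = min(box_a[2], box_b[2])
    bottom = min(box_a[3], box_b[3])
    inter = max(0, right - left) * max(0, bottom - top)  # overlap area
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 0, 15, 10)))    # 50 overlap / 150 union = 1/3
print(iou((0, 0, 10, 10), (20, 20, 30, 30)))  # disjoint boxes: 0.0
```

An IoU of 1.0 means a perfect match, 0.0 means no overlap; evaluation protocols typically count a detection as correct when its IoU with the ground truth exceeds a threshold such as 0.5.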
Fig 2.1.1 Demonstration of object detection
The starter app ships with code that facilitates the capture of images and their
subsequent analysis for object detection.
3. Camera Integration:
Learners are introduced to the integration of camera functionality within the
Android app. The code enables users to either capture a photo using the device's
camera or select a pre-existing image from the device's gallery.
7. Object Detection Results:
The module concludes with the presentation of results obtained from the
object detection process. Learners can observe the detected objects along with their
corresponding labels, providing insight into the effectiveness of the integration.
CHAPTER 3: GO FURTHER WITH OBJECT DETECTION
TensorFlow Lite is the engine utilized within ML Kit to execute machine
learning models. The TensorFlow Lite ecosystem comprises two key components that
streamline the training and deployment of machine-learning models on mobile devices:
the TensorFlow Lite Model Maker, for training custom models, and the TensorFlow Lite
Task Library, for deploying them.
TensorFlow Lite also features a pre-trained object detection model enabling object
detection tasks. It provides methods like run_object_detection(),
draw_detection_result(), and more.
Within the TensorFlow ecosystem, Model Maker emerges as a pivotal tool,
celebrated for its adeptness in simplifying the creation of custom machine learning
models. Among its array of functionalities, Model Maker excels in offering a
streamlined pathway for diverse tasks, including image classification, text
classification, and, notably, object detection. For object detection in particular, Model
Maker works hand in hand with the TensorFlow Lite Task Library tailored for this
domain, a pairing widely adopted by developers and researchers alike.
As training unfolds, Model Maker's automated processes take the helm,
iterating through epochs while refining model parameters. Alongside, users monitor key
metrics such as accuracy and loss, gaining valuable insight into model performance.
Armed with this information, users guide their models towards deployment. With Model
Maker's export utilities, the transition from training to inference is seamless, making the
trained model readily available for real-world applications.
CHAPTER 4: GET STARTED WITH PRODUCT IMAGE
SEARCH
This chapter delves into the functionality and integration of Vision API
Product Search, a pivotal component behind applications like Google Lens,
renowned for its capability to conduct product searches within images. Through the
use of machine learning algorithms, Vision API Product Search empowers
developers to analyze image content and extract pertinent product details seamlessly.
The project is initiated within Android Studio, a robust IDE for Android app
development, where integration with the ML Kit Object Detection and Tracking API
takes place. This integration equips developers with the necessary tools to implement
object detection functionality seamlessly into their Android applications, enhancing
user experience and interactivity.
Within the provided codebase, developers encounter boilerplate code
facilitating image capture from either the device's camera or pre-loaded images,
streamlining the process of object detection. This functionality ensures a user-friendly
experience, enabling effortless interaction with the object detection feature.
CHAPTER 5: GO FURTHER WITH PRODUCT IMAGE
SEARCH
The preceding chapter of our project journey involved the loading of the
starter app within Android Studio, where we successfully detected objects present
in the provided images and conducted subsequent product searches. This initial
phase laid the groundwork for our exploration into more advanced functionalities
within the application.
The starter app exhibited the capability to detect objects within images
autonomously. However, considering scenarios where multiple objects may exist
within an image or where the detected object occupies a small portion of the overall
image, user interaction becomes essential. To address this, the app required
enhancements that enable users to tap on a detected object, thereby indicating their
selection for product search purposes.
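The tap-to-select enhancement reduces to a simple hit test: find the detected boxes that contain the tap point and, when several overlap, prefer the smallest so a small object in front of a larger one can still be picked. The sketch below illustrates that logic in Python rather than the app's own Android code; the labels and coordinates are made up:

```python
# Selecting a detected object by tap: return the detected bounding box
# that contains the tap point, preferring the smallest box when the tap
# lands inside several overlapping detections.
def pick_box(tap_x, tap_y, boxes):
    # boxes: list of (label, left, top, right, bottom)
    hits = [b for b in boxes
            if b[1] <= tap_x <= b[3] and b[2] <= tap_y <= b[4]]
    if not hits:
        return None  # tap missed every detection
    return min(hits, key=lambda b: (b[3] - b[1]) * (b[4] - b[2]))

boxes = [("table", 0, 0, 400, 300), ("bottle", 150, 50, 200, 180)]
print(pick_box(170, 100, boxes))  # inside both boxes; the bottle is smaller
print(pick_box(350, 250, boxes))  # only the table contains this tap
```

Preferring the smallest containing box is one reasonable policy; an app could instead rank overlapping detections by confidence score if the detector provides one.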
Transitioning to the current chapter, we embarked on a journey to establish
our custom backend infrastructure, a crucial step towards achieving more advanced
functionalities. This process commenced with a comprehensive exploration of the
Vision API Product Search quick start guide, providing valuable insights into the
creation of bespoke backend solutions tailored to our specific requirements.
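At its core, such a backend performs similarity search: a model turns each catalog image into a feature vector, and a query image is matched against its nearest neighbors. The toy sketch below illustrates the idea with hand-picked three-dimensional vectors and cosine similarity; the product names and numbers are invented, and the real Vision API Product Search service handles feature extraction and indexing itself:

```python
import math

# Toy stand-in for a product-search backend: rank catalog products by
# the cosine similarity of their feature vectors to a query vector.
def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

catalog = {
    "red sneaker": [0.9, 0.1, 0.2],
    "blue sneaker": [0.8, 0.2, 0.3],
    "leather bag": [0.1, 0.9, 0.6],
}

def search(query_vec, top_k=2):
    ranked = sorted(catalog,
                    key=lambda name: cosine(catalog[name], query_vec),
                    reverse=True)
    return ranked[:top_k]

print(search([0.85, 0.15, 0.25]))  # both sneakers rank above the bag
```

Real systems use vectors with hundreds of dimensions and approximate nearest-neighbor indexes so queries stay fast across millions of products, but the ranking principle is the same.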
CHAPTER 6: GO FURTHER WITH IMAGE CLASSIFICATION
One of the standout features of Model Maker is its ability to generate default
models with remarkable ease. With just a single line of code, developers can initiate
the model creation process, leveraging the underlying neural network to undergo
training based on provided datasets. This streamlined approach significantly
expedites the model development process, enabling rapid iteration and
experimentation.
The dataset is loaded from the Keras library and meticulously divided into distinct
training and validation sets. This separation ensures the model's ability to generalize well
to unseen data, a critical aspect of robust machine learning model development.
Upon successful completion of the training phase, the trained model is ready
for deployment within mobile applications. Utilizing the TensorFlow Lite
framework, the trained model is exported into a file format known as .tflite,
optimized for efficient execution on resource-constrained mobile and embedded
platforms. This exportation process paves the way for seamless integration of the
custom image classifier into mobile applications, enabling real-world deployment
and utilization.
CASE STUDY
Object Detection:
When users launch the app and point their smartphone camera at a scene,
the app utilizes object detection to identify and outline various objects
within the camera view.
For instance, if the user points the camera at a table with different items,
the app detects and outlines individual products like bottles, fruits, and
electronics.
Upon detecting products in the scene, the app prompts users to select a
specific product they're interested in purchasing.
Once the user selects a product, the app initiates a product image search
using the selected item's image.
Leveraging the Vision API Product Search, the app sends the product
image to the backend, which returns visually similar products from the
app's product catalog.
Image Classification:
This classification allows the app to present more accurate and relevant
product suggestions to the user based on their preferences and browsing
history.
By tracking user behavior, such as past purchases and product views, the
app offers personalized product recommendations tailored to each user's
interests.
For example, if a user frequently searches for electronic gadgets, the app
prioritizes showing similar products in the search results.
Users can view detailed product information, reviews, and pricing from
various online stores directly within the app.
Additionally, users can add products to their shopping carts and complete
purchases without leaving the app.
Feedback and Continuous Improvement:
The app collects user feedback to continuously refine its features and improve the
user experience.
Machine learning models are periodically updated and retrained using the
latest data to enhance accuracy and relevance in product detection, search,
and classification.
CONCLUSION
Overall, this internship has equipped me with a diverse skill set and a profound
appreciation for the potential of machine learning in addressing complex problems.
As I transition to the next phase of my career, I am confident that the knowledge and
experiences gained during this internship will serve as a solid foundation for future
endeavors in the dynamic field of machine learning.
REFERENCES
https://developers.google.com/learn/pathways/tensorflow
https://developers.google.com/learn/pathways/get-started-object-detection
https://developers.google.com/learn/pathways/going-further-object-detection
https://developers.google.com/learn/pathways/get-started-image-product-search
https://developers.google.com/learn/pathways/going-further-image-product-search
https://developers.google.com/learn/pathways/going-further-image-classification