How To Classify Images With Tensorflow Using Google Cloud Machine Learning and Cloud Dataflow
Google Cloud
Slaven Bilac
Software Engineer
For specialized image-classification use cases, using Google Cloud Dataflow and Google Cloud Machine Learning makes it easy to train and implement machine-learning models.
Google Cloud Vision API is a popular service that allows users to classify images into
categories, appropriate for multiple common use cases across several industries. For
those users whose category requirements map to the pre-built, pre-trained machine-
learning model reflected in the API, this approach is ideal. However, other users have more specialized requirements — for example, to identify specific products and soft goods in mobile-phone photos, or to detect nuanced differences between particular animal species in wildlife photography. For them, it can be more efficient to train and
serve a new image model using Google Cloud Machine Learning (Cloud ML), the
managed service for building and running machine-learning models at scale using the
open source TensorFlow deep-learning framework.
In this post, we'll build a simple model in Cloud ML using a small set of labeled flower images. This dataset has been selected for ease of explanation only; we've successfully used the same implementation for several proprietary datasets covering cases like interior-design classification (e.g., carpet vs. hardwood floor) and animated-character classification. The code can be found here and can easily be adapted to run on different datasets.
Sunflowers by Liz West is licensed under CC BY 2.0
In addition to image files, we've provided a CSV file (all_data.csv) containing the image URIs and labels. We randomly split this data into two files, train_set.csv and eval_set.csv, with 90% of the data for training and 10% for evaluation, respectively.
gs://cloud-ml-data/img/flower_photos/dandelion/17388674711_6dca8a2e8b_n.jpg,dandelion
gs://cloud-ml-data/img/flower_photos/sunflowers/9555824387_32b151e9b0_m.jpg,sunflowers
gs://cloud-ml-data/img/flower_photos/daisy/14523675369_97c31d0b5b.jpg,daisy
gs://cloud-ml-data/img/flower_photos/roses/512578026_f6e6f2ad26.jpg,roses
gs://cloud-ml-data/img/flower_photos/tulips/497305666_b5d4348826_n.jpg,tulips
...
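A split like this is easy to reproduce. The sketch below is our illustration (the sample's actual tooling may differ); it shuffles all_data.csv and writes the 90/10 train/eval files named in the post:

```python
import csv
import random

def split_dataset(all_csv, train_csv, eval_csv, eval_fraction=0.1, seed=42):
    """Randomly split a CSV of (image URI, label) rows into train/eval files."""
    with open(all_csv) as f:
        rows = list(csv.reader(f))
    random.Random(seed).shuffle(rows)  # fixed seed for a reproducible split
    n_eval = int(len(rows) * eval_fraction)
    for path, subset in ((eval_csv, rows[:n_eval]), (train_csv, rows[n_eval:])):
        with open(path, "w", newline="") as f:
            csv.writer(f).writerows(subset)
```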
We also need a text file containing all the labels (dict.txt), which is used to sequentially map labels to internally used IDs. In this case, daisy would become ID 0 and tulips would become 4. If a label isn't in the file, it will be ignored during preprocessing and training.
daisy
dandelion
roses
sunflowers
tulips
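Because an ID is just a label's line position in dict.txt, the mapping can be built in a few lines. A minimal sketch (the function name is ours):

```python
def load_label_ids(dict_path):
    """Map each label in dict.txt to its line index, which is the internal ID."""
    with open(dict_path) as f:
        return {line.strip(): i for i, line in enumerate(f) if line.strip()}
```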
Because we only have a small set of images in this sample dataset, it would be hard to
build an accurate classification model from scratch. Instead, we'll use an approach called transfer learning, taking a pre-trained model called Inception and using it to extract image features that we'll then use to train a new classifier.
Before following along below, be sure to set up your project and environment first.
Preprocessing
We start with a set of labeled images in a Google Cloud Storage bucket and preprocess them to extract the image features from the bottleneck layer (typically the penultimate
layer) of the Inception network. Although processing images in this manner can be
reasonably expensive, each image can be processed independently and in parallel,
making this task a great candidate for Cloud Dataflow.
The flowers dataset is about 3,600 images in size and can be processed on a single machine. For larger sets, though, preprocessing becomes the bottleneck, and parallelization can lead to a huge increase in throughput. To measure the benefit of parallelizing preprocessing on Google Cloud, we ran the above preprocessing on 1 million sample images from the Open Images Dataset. We found that while it takes several days to preprocess 1 million images locally, it takes less than two hours on the cloud when we use 100 workers with four cores each!
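Cloud Dataflow handles this fan-out at scale in the actual sample. Purely as a local illustration of why the task parallelizes so well (each image is processed independently), here is a thread-pool sketch with the per-image work stubbed out:

```python
from concurrent.futures import ThreadPoolExecutor

def preprocess_image(uri):
    # Placeholder for the real work: fetch the image, decode it, and run it
    # through Inception to extract the bottleneck embedding.
    return (uri, "embedding")

def preprocess_all(uris, max_workers=8):
    """Images are independent, so they can be processed concurrently.

    pool.map preserves input order in its results.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(preprocess_image, uris))
```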
Modeling
Once we've preprocessed the data, we can train a simple classifier. The network comprises a single fully connected layer with ReLU activations, with one output for each label in the dictionary, replacing the original output layer. The final output is computed using the softmax function. Note that during training we use the dropout technique, which randomly ignores a subset of input weights to prevent overfitting to the training dataset.
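To make the shape of this classifier head concrete, here is a NumPy sketch (our illustration, not the sample's TensorFlow code) of the fully connected ReLU layer, training-time dropout, and softmax output:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def classifier_head(bottleneck, w_hidden, b_hidden, w_out, b_out,
                    training=False, keep_prob=0.5, rng=None):
    """Fully connected ReLU layer + softmax output, one unit per label."""
    h = np.maximum(0.0, bottleneck @ w_hidden + b_hidden)  # ReLU activation
    if training:
        # Dropout: randomly zero hidden activations (the inputs to the
        # output layer) and rescale, so eval needs no correction.
        if rng is None:
            rng = np.random.default_rng(0)
        h = h * (rng.random(h.shape) < keep_prob) / keep_prob
    return softmax(h @ w_out + b_out)
```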
Training
The training can be run using the following command:
# Submit training job.
gcloud ml-engine jobs submit training "$JOB_ID" \
--module-name trainer.task \
--package-path trainer \
--staging-bucket "$BUCKET" \
--region us-central1 \
-- \
--output_path "${GCS_PATH}/training" \
--eval_data_paths "${GCS_PATH}/preproc/eval*" \
--train_data_paths "${GCS_PATH}/preproc/train*"
# Monitor training logs.
gcloud ml-engine jobs stream-logs "$JOB_ID"
We can monitor the progress of training using TensorBoard. For the flowers dataset, we reach an accuracy of ~90% on our eval set.
Prediction
For prediction, we don't want to separate the image preprocessing and inference into two separate steps, because we need to perform both in sequence for every image. Instead, we create a single TensorFlow graph that produces the image embedding and does the classification using the trained model in one step.
The format of the request is JSON with image content encoded using base64 encoding.
# Copy the image to local disk.
gsutil cp gs://cloud-ml-data/img/flower_photos/tulips/4520577328_a94c11e806_n.jpg flower.jpg

# Create the request message in JSON format.
python -c 'import base64, sys, json; img = base64.b64encode(open(sys.argv[1], "rb").read()).decode(); print(json.dumps({"key": "0", "image_bytes": {"b64": img}}))' flower.jpg > request.json

# Call the prediction service API to get classifications.
gcloud ml-engine predict --model ${MODEL_NAME} --json-instances request.json
Alternatively, you can use a command-line tool to encode multiple images into a single
JSON request.
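As an illustration, a short script (ours, not part of the sample) can base64-encode several images into the one-JSON-instance-per-line format that `gcloud ml-engine predict --json-instances` expects:

```python
import base64
import json

def images_to_request(paths, out_path):
    """Write one JSON instance per line for --json-instances input."""
    with open(out_path, "w") as out:
        for i, path in enumerate(paths):
            with open(path, "rb") as img:
                b64 = base64.b64encode(img.read()).decode("ascii")
            out.write(json.dumps({"key": str(i),
                                  "image_bytes": {"b64": b64}}) + "\n")
```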
Using an example image from the eval set, we get the following response when running
the above predict command:
predictions:
- key: '0'
prediction: 4
scores:
- 8.11998e-09
- 2.64907e-08
- 1.10307e-06
- 3.69488e-11
- 0.999999
- 3.35913e-09
It correctly identifies the most likely category as 4, which is the ID of tulips, with a score of ~0.99. Very nice!
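Mapping that index back to a label just means indexing into dict.txt's label order. In the sketch below (our code), the score vector is copied from the response above; treating any score index beyond the five dict.txt labels as an unrecognized-label bucket is our assumption about the sample's extra output:

```python
def top_label(scores, labels):
    """Map the argmax of the score vector back to a dict.txt label."""
    best = max(range(len(scores)), key=scores.__getitem__)
    name = labels[best] if best < len(labels) else "unknown"
    return name, scores[best]

# Labels in dict.txt order; scores from the sample response above.
labels = ["daisy", "dandelion", "roses", "sunflowers", "tulips"]
scores = [8.11998e-09, 2.64907e-08, 1.10307e-06, 3.69488e-11,
          0.999999, 3.35913e-09]
```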
Next steps
In short, if you have relatively specialized image-classification category requirements not reflected in Cloud Vision API, using Cloud Dataflow and Cloud ML can make it easy to train classification models using labeled images and deploy them for online classification.
If you have any feedback on the design or documentation of this example or have other
issues, please report it on GitHub; pull requests and contributions are also welcome.
We're always working to improve and simplify our products, so stay tuned for new
improvements in the near future!
See also:
Cloud Dataflow documentation
Cloud ML documentation
Cloud Vision API documentation