
Table of Contents

Computer Vision Documentation


Overview
How-to
Obtain subscription keys
Call the Computer Vision API
Analyze videos in real time
Quickstarts
cURL
C#
Java
JavaScript
PHP
Python
Ruby
Tutorials
C#
Python
Reference
API reference
SDKs
Android
Swift
Windows
Resources
Samples
Category taxonomy
FAQ
Research papers
Computer Vision Documentation
The cloud-based Computer Vision API provides developers with access to advanced algorithms for processing images and
returning information. By uploading an image or specifying an image URL, Microsoft Computer Vision algorithms can analyze
visual content in different ways based on inputs and user choices. Learn how to analyze visual content in different ways with our
quickstarts, tutorials, and samples.

5-Minute Quickstarts
Analyze images, generate thumbnails, and extract text from an image using:
C#
cURL
Java
JavaScript
PHP
Python
Ruby

Step-by-Step Tutorials
Develop applications using the Computer Vision API:
1. C# Tutorial
2. Python Tutorial

Reference
APIs
API Reference

SDKs
Android
Swift
Windows
Computer Vision API Version 1.0
6/8/2017 • 9 min to read

The cloud-based Computer Vision API provides developers with access to advanced algorithms for processing
images and returning information. By uploading an image or specifying an image URL, Microsoft Computer Vision
algorithms can analyze visual content in different ways based on inputs and user choices. With the Computer Vision
API users can analyze images to:
Tag images based on content.
Categorize images.
Identify the type and quality of images.
Detect human faces and return their coordinates.
Recognize domain-specific content.
Generate descriptions of the content.
Use optical character recognition to identify printed text found in images.
Recognize handwritten text.
Distinguish color schemes.
Flag adult content.
Crop photos to be used as thumbnails.

Requirements
Supported input methods: Raw image binary in the form of an application/octet stream or image URL.
Supported image formats: JPEG, PNG, GIF, BMP.
Image file size: Less than 4 MB.
Image dimension: Greater than 50 x 50 pixels.

Tagging Images
Computer Vision API returns tags based on more than 2000 recognizable objects, living beings, scenery, and
actions. When tags are ambiguous or not common knowledge, the API response provides 'hints' to clarify the
meaning of the tag in context of a known setting. Tags are not organized as a taxonomy and no inheritance
hierarchies exist. A collection of content tags forms the foundation for an image 'description' displayed as human
readable language formatted in complete sentences. Note that, at this point, English is the only supported language
for image description.
After uploading an image or specifying an image URL, Computer Vision API's algorithms output tags based on the
objects, living beings, and actions identified in the image. Tagging is not limited to the main subject, such as a
person in the foreground, but also includes the setting (indoor or outdoor), furniture, tools, plants, animals,
accessories, gadgets etc.
Example

Returned Json
{
   "tags":[
      {
         "name":"grass",
         "confidence":0.999999761581421
      },
      {
         "name":"outdoor",
         "confidence":0.999970674514771
      },
      {
         "name":"sky",
         "confidence":0.999289751052856
      },
      {
         "name":"building",
         "confidence":0.996463239192963
      },
      {
         "name":"house",
         "confidence":0.992798030376434
      },
      {
         "name":"lawn",
         "confidence":0.822680294513702
      },
      {
         "name":"green",
         "confidence":0.641222536563873
      },
      {
         "name":"residential",
         "confidence":0.314032256603241
      }
   ]
}
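
The same tags can be retrieved programmatically. The following is a minimal sketch, assuming the Microsoft.ProjectOxford.Vision client library introduced in the how-to section of this document, with a hypothetical key and image URL:

using System;
using System.Threading.Tasks;
using Microsoft.ProjectOxford.Vision;
using Microsoft.ProjectOxford.Vision.Contract;

class TagsSketch
{
    static async Task Main()
    {
        // Hypothetical subscription key and image URL.
        var visionClient = new VisionServiceClient("<subscription key>");
        AnalysisResult result = await visionClient.GetTagsAsync("http://contoso.com/example.jpg");

        // Each tag carries a name, a confidence score between 0 and 1,
        // and (for ambiguous tags) an optional hint.
        foreach (Tag tag in result.Tags)
            Console.WriteLine($"{tag.Name}: {tag.Confidence}");
    }
}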

Categorizing Images
In addition to tagging and descriptions, Computer Vision API returns the taxonomy-based categories defined in
previous versions. These categories are organized as a taxonomy with parent/child hereditary hierarchies. All
categories are in English. They can be used alone or with our new models.
The 86-category concept
Based on a list of 86 concepts seen in the following diagram, visual features found in an image can be categorized
ranging from broad to specific. For the full taxonomy in text format, see Category Taxonomy.
Example category responses for sample images (images omitted): people, people_crowd, animal_dog, outdoor_mountain, food_bread.

Identifying Image Types


There are several ways to classify the overall type of an image. Computer Vision API can set a boolean flag to
indicate whether an image is black and white or color, set a flag to indicate whether an image is a line drawing, and
indicate whether an image is clip art, rating its quality as such on a scale of 0-3.
Clip-art type
Detects whether an image is clip art or not.

VALUE  MEANING
0      Non-clip-art
1      Ambiguous
2      Normal-clip-art
3      Good-clip-art

Example responses for sample images (images omitted): 3 (good-clip-art); 0 (non-clip-art).

Line drawing type


Detects whether an image is a line drawing or not.

Example responses for sample images (images omitted): True; False.
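
As a sketch of reading these flags through the client library used elsewhere in this document (the VisualFeature.ImageType flag and the ImageType contract class are assumptions to verify against your SDK version):

using System;
using System.Threading.Tasks;
using Microsoft.ProjectOxford.Vision;
using Microsoft.ProjectOxford.Vision.Contract;

class ImageTypeSketch
{
    static async Task Main()
    {
        // Hypothetical subscription key and image URL.
        var visionClient = new VisionServiceClient("<subscription key>");
        AnalysisResult result = await visionClient.AnalyzeImageAsync(
            "http://contoso.com/example.jpg", new[] { VisualFeature.ImageType });

        // ClipArtType: 0 non-clip-art, 1 ambiguous, 2 normal clip art, 3 good clip art.
        Console.WriteLine($"Clip-art type: {result.ImageType.ClipArtType}");
        Console.WriteLine($"Line drawing:  {result.ImageType.LineDrawingType == 1}");
    }
}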

Faces
Detects human faces within a picture and generates the face coordinates, the rectangle for the face, gender, and age.
These visual features are a subset of metadata generated for face. For more extensive metadata generated for faces
(facial identification, pose detection, and more), use the Face API.

Example responses for sample images (images omitted):

[ { "age": 23, "gender": "Female", "faceRectangle": { "left": 1379, "top": 320, "width": 310, "height": 310 } } ]

[ { "age": 28, "gender": "Female", "faceRectangle": { "left": 447, "top": 195, "width": 162, "height": 162 } },
  { "age": 10, "gender": "Male", "faceRectangle": { "left": 355, "top": 87, "width": 143, "height": 143 } } ]

[ { "age": 11, "gender": "Male", "faceRectangle": { "left": 113, "top": 314, "width": 222, "height": 222 } },
  { "age": 11, "gender": "Female", "faceRectangle": { "left": 1200, "top": 632, "width": 215, "height": 215 } },
  { "age": 41, "gender": "Male", "faceRectangle": { "left": 514, "top": 223, "width": 205, "height": 205 } },
  { "age": 37, "gender": "Female", "faceRectangle": { "left": 1008, "top": 277, "width": 201, "height": 201 } } ]

Domain-Specific Content
In addition to tagging and top-level categorization, Computer Vision API also supports specialized (or domain-
specific) information. Specialized information can be implemented as a standalone method or with the high-level
categorization. It functions as a means to further refine the 86-category taxonomy through the addition of domain-
specific models.
Currently, the only specialized information supported are celebrity recognition and landmark recognition. They are
domain-specific refinements for the people and people group categories, and landmarks around the world.
There are two options for using the domain-specific models:
Option One - Scoped Analysis
Analyze only a chosen model, by invoking an HTTP POST call. For this option, if you know which model you want to
use, you specify the model's name, and you only get information relevant to that model. For example, you can use
this option to only look for celebrity-recognition. The response contains a list of potential matching celebrities,
accompanied by their confidence scores.
Option Two - Enhanced Analysis
Analyze to provide additional details related to categories from the 86-category taxonomy. This option is available
for use in applications where users want to get generic image analysis in addition to details from one or more
domain-specific models. When this method is invoked, the 86-category taxonomy classifier is called first. If any of
the categories match that of known/matching models, a second pass of classifier invocations follows. For example,
if 'details=all' is specified, or 'details' includes 'celebrities', the method calls the celebrity classifier after the 86-category classifier
is called. The result includes tags starting with 'people_'.

Generating Descriptions
Computer Vision API's algorithms analyze the content in an image. This analysis forms the foundation for a
'description' displayed as human-readable language in complete sentences. The description summarizes what is
found in the image. Computer Vision API's algorithms generate various descriptions based on the objects identified
in the image. The descriptions are each evaluated and a confidence score generated. A list is then returned ordered
from highest confidence score to lowest. An example of a bot that uses this technology to generate image captions
can be found here.
Example Description Generation

Returned Json

"description":{
   "captions":[
      {
         "type":"phrase",
         "text":"a black and white photo of a large city",
         "confidence":0.607638706850331
      },
      {
         "type":"phrase",
         "text":"a photo of a large city",
         "confidence":0.577256764264197
      },
      {
         "type":"phrase",
         "text":"a black and white photo of a city",
         "confidence":0.538493271791207
      }
   ],
   "tags":[
      "outdoor",
      "city",
      "building",
      "photo",
      "large"
   ]
}
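
A hedged sketch of requesting ranked captions like the response above through the client library (the maxCandidates parameter name is an assumption to verify against your SDK version):

using System;
using System.Threading.Tasks;
using Microsoft.ProjectOxford.Vision;
using Microsoft.ProjectOxford.Vision.Contract;

class DescribeSketch
{
    static async Task Main()
    {
        var visionClient = new VisionServiceClient("<subscription key>");

        // Ask for up to three candidate captions, ranked by confidence.
        AnalysisResult result = await visionClient.DescribeAsync(
            "http://contoso.com/example.jpg", maxCandidates: 3);

        foreach (Caption caption in result.Description.Captions)
            Console.WriteLine($"{caption.Text} ({caption.Confidence:F3})");
    }
}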

Perceiving Color Schemes


The Computer Vision algorithm extracts colors from an image. The colors are analyzed in three different contexts:
foreground, background, and whole. They are grouped into twelve dominant accent colors. Those accent colors
are black, blue, brown, gray, green, orange, pink, purple, red, teal, white, and yellow. Depending on the colors in an
image, simple black and white or accent colors may be returned in hexadecimal color codes.

Example responses for sample images (images omitted):

FOREGROUND  BACKGROUND  COLORS
Black       Black       White
Black       White       White, Black, Green
Black       Black       Black

Accent color
Color extracted from an image designed to represent the most eye-popping color to users via a mix of dominant
colors and saturation.

Example accent-color responses for sample images (images omitted): #BC6F0F, #CAA501, #484B83.

Black & White


Boolean flag that indicates whether an image is black and white.

Example responses for sample images (images omitted): True; False.
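
A minimal sketch of reading the color analysis through the client library (key and URL are placeholders):

using System;
using System.Threading.Tasks;
using Microsoft.ProjectOxford.Vision;
using Microsoft.ProjectOxford.Vision.Contract;

class ColorSketch
{
    static async Task Main()
    {
        var visionClient = new VisionServiceClient("<subscription key>");
        AnalysisResult result = await visionClient.AnalyzeImageAsync(
            "http://contoso.com/example.jpg", new[] { VisualFeature.Color });

        Console.WriteLine($"Foreground:    {result.Color.DominantColorForeground}");
        Console.WriteLine($"Background:    {result.Color.DominantColorBackground}");
        Console.WriteLine($"Accent:        #{result.Color.AccentColor}");
        Console.WriteLine($"Black & white: {result.Color.IsBWImg}");
    }
}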

Flagging Adult Content


Among the various visual categories is the adult and racy group, which enables detection of adult materials and
restricts the display of images containing sexual content. The filter for adult and racy content detection can be set on
a sliding scale to accommodate the user's preference.
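
The sliding scale mentioned above maps naturally to thresholds applied to the raw scores. A sketch, assuming the client library used elsewhere in this document (the 0.5 threshold is purely illustrative, not an API default):

using System;
using System.Threading.Tasks;
using Microsoft.ProjectOxford.Vision;
using Microsoft.ProjectOxford.Vision.Contract;

class AdultSketch
{
    static async Task Main()
    {
        var visionClient = new VisionServiceClient("<subscription key>");
        AnalysisResult result = await visionClient.AnalyzeImageAsync(
            "http://contoso.com/example.jpg", new[] { VisualFeature.Adult });

        // The boolean flags use the service's default thresholds; apply your own
        // threshold to the raw scores for a stricter or looser filter.
        const double adultThreshold = 0.5; // illustrative value only
        Console.WriteLine($"Adult: {result.Adult.IsAdultContent} (score {result.Adult.AdultScore:F3})");
        Console.WriteLine($"Racy:  {result.Adult.IsRacyContent} (score {result.Adult.RacyScore:F3})");
        Console.WriteLine($"Blocked by custom filter: {result.Adult.AdultScore > adultThreshold}");
    }
}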

Optical Character Recognition (OCR)


OCR technology detects text content in an image and extracts the identified text into a machine-readable character
stream. You can use the result for search and numerous other purposes like medical records, security, and banking.
It automatically detects the language. OCR saves time and provides convenience for users by allowing them to take
photos of text instead of transcribing the text.
OCR supports 21 languages. These languages are: Chinese Simplified, Chinese Traditional, Czech, Danish, Dutch,
English, Finnish, French, German, Greek, Hungarian, Italian, Japanese, Korean, Norwegian, Polish, Portuguese,
Russian, Spanish, Swedish, and Turkish.
If needed, OCR corrects the rotation of the recognized text, in degrees, around the horizontal image axis. OCR
also provides the frame coordinates of each word.

Requirements for OCR:
The size of the input image must be between 40 x 40 and 3200 x 3200 pixels.
The image cannot be bigger than 10 megapixels.
Input image can be rotated by any multiple of 90 degrees plus a small angle of up to ±40 degrees.
The accuracy of text recognition depends on the quality of the image. An inaccurate reading may be caused by the
following situations:
Blurry images.
Handwritten or cursive text.
Artistic font styles.
Small text size.
Complex backgrounds, shadows, or glare over text or perspective distortion.
Oversized or missing capital letters at the beginnings of words.
Subscript, superscript, or strikethrough text.
Limitations: On photos where text is dominant, false positives may come from partially recognized words. On some
photos, especially photos without any text, precision can vary a lot depending on the type of image.
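
A minimal OCR sketch using the client library shown later in this guide; "unk" requests automatic language detection, and detectOrientation asks the service to correct rotated text (the signature is assumed from the Microsoft.ProjectOxford.Vision package and should be verified):

using System;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.ProjectOxford.Vision;
using Microsoft.ProjectOxford.Vision.Contract;

class OcrSketch
{
    static async Task Main()
    {
        var visionClient = new VisionServiceClient("<subscription key>");
        OcrResults ocr = await visionClient.RecognizeTextAsync(
            "http://contoso.com/sign.jpg", languageCode: "unk", detectOrientation: true);

        // Flatten the regions -> lines -> words hierarchy into plain text.
        foreach (var region in ocr.Regions)
            foreach (var line in region.Lines)
                Console.WriteLine(string.Join(" ", line.Words.Select(w => w.Text)));
    }
}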

Recognize Handwritten Text


This technology allows you to detect and extract handwritten text from notes, letters, essays, whiteboards, forms,
etc. It works with different surfaces and backgrounds, such as white paper, yellow sticky notes, and whiteboards.
Handwritten text recognition saves time and effort and can make you more productive by allowing you to take
images of text, rather than having to transcribe it. It makes it possible to digitize notes. This digitization allows you
to implement quick and easy search. It also reduces paper clutter.
Input requirements:
Supported image formats: JPEG, PNG, and BMP.
Image file size must be less than 4 MB.
Image dimensions must be at least 40 x 40, at most 3200 x 3200.
Note: this technology is currently in preview and is only available for English text.
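
Because this feature is in preview, the client library may not expose it; the following raw-REST sketch assumes the v1.0 recognizeText endpoint with handwriting=true and the asynchronous Operation-Location polling pattern (verify the endpoint and status strings before relying on them):

using System;
using System.IO;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

class HandwritingSketch
{
    static async Task Main()
    {
        var client = new HttpClient();
        client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "<subscription key>");

        // POST the image; the service accepts the job and returns an
        // Operation-Location header pointing at the eventual result.
        var content = new ByteArrayContent(File.ReadAllBytes(@"C:\Vision\Note.jpg"));
        content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
        HttpResponseMessage post = await client.PostAsync(
            "https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/recognizeText?handwriting=true",
            content);
        string operationUrl = post.Headers.GetValues("Operation-Location").First();

        // Poll until the asynchronous recognition finishes.
        string json;
        do
        {
            await Task.Delay(1000);
            json = await client.GetStringAsync(operationUrl);
        } while (!json.Contains("\"Succeeded\"") && !json.Contains("\"Failed\""));

        Console.WriteLine(json); // recognized lines and words, as JSON
    }
}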

Generating Thumbnails
A thumbnail is a small representation of a full-size image. Varied devices such as phones, tablets, and PCs create a
need for different user experience (UX) layouts and thumbnail sizes. Using smart cropping, this Computer Vision
API feature helps solve the problem.
After uploading an image, a high-quality thumbnail gets generated and the Computer Vision API algorithm
analyzes the objects within the image. It then crops the image to fit the requirements of the 'region of interest'
(ROI). The output gets displayed within a special framework. The generated thumbnail can be presented using an
aspect ratio that differs from the aspect ratio of the original image to accommodate a user's needs.
The thumbnail algorithm works as follows:
1. Removes distracting elements from the image and recognizes the main object, the 'region of interest' (ROI).
2. Crops the image based on the identified region of interest.
3. Changes the aspect ratio to fit the target thumbnail dimensions.
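
A minimal sketch of the same operation through the client library (the GetThumbnailAsync signature is an assumption to verify against your SDK version); smartCropping: true enables the ROI-based algorithm described above:

using System.IO;
using System.Threading.Tasks;
using Microsoft.ProjectOxford.Vision;

class ThumbnailSketch
{
    static async Task Main()
    {
        var visionClient = new VisionServiceClient("<subscription key>");

        // Returns the raw JPEG bytes of the smart-cropped thumbnail.
        byte[] thumbnail = await visionClient.GetThumbnailAsync(
            "http://contoso.com/example.jpg", width: 200, height: 150, smartCropping: true);

        File.WriteAllBytes(@"C:\Vision\thumbnail.jpg", thumbnail);
    }
}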
Obtaining Subscription Keys
5/24/2017 • 1 min to read

Computer Vision services require special subscription keys. Every call to the Computer Vision API requires a
subscription key. This key needs to be either passed through a query string parameter or specified in the request
header.
To sign up for subscription keys, see Subscriptions. It's free to sign up. Pricing for these services is subject to
change.

NOTE
Your subscription keys are valid for only one of these Microsoft Azure Regions.

REGION           ADDRESS
West US          westus.api.cognitive.microsoft.com
East US 2        eastus2.api.cognitive.microsoft.com
West Central US  westcentralus.api.cognitive.microsoft.com
West Europe      westeurope.api.cognitive.microsoft.com
Southeast Asia   southeastasia.api.cognitive.microsoft.com

If you sign up using the Computer Vision free trial, your subscription keys are valid for the westcentralus region
( https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/ ). That is the most common case. However, if you sign up for
Computer Vision with your Microsoft Azure account through the https://azure.microsoft.com/ website, you specify
the region for your trial from the preceding list of regions.
For example, if you sign up for Computer Vision with your Microsoft Azure account and you specify westus for
your region, you must use the westus region for your REST API calls ( https://westus.api.cognitive.microsoft.com/vision/v1.0/ ).
If you forget the region for your subscription key after obtaining your trial key, you can find your region at
https://azure.microsoft.com/try/cognitive-services/my-apis/.
Related Links:
Subscriptions
Pricing Options for Microsoft Cognitive APIs
How to Call Computer Vision API
4/12/2017 • 5 min to read

This guide demonstrates how to call Computer Vision API using REST. The samples are written both in C# using the
Computer Vision API client library, and as HTTP POST/GET calls. We will focus on:
How to get "Tags", "Description" and "Categories".
How to get "Domain-specific" information (celebrities).
Image URL or path to locally stored image.
Prerequisites

Supported input methods: Raw image binary in the form of an application/octet stream or image URL
Supported image formats: JPEG, PNG, GIF, BMP
Image file size: Less than 4MB
Image dimension: Greater than 50 x 50 pixels
In the examples below, the following features are demonstrated:
1. Analyzing an image and getting an array of tags and a description returned.
2. Analyzing an image with a domain-specific model (specifically, the "celebrities" model) and getting the
corresponding result returned in JSON.
Features are broken down into:
Option One: Scoped Analysis - Analyze only a given model
Option Two: Enhanced Analysis - Analyze to provide additional details with 86-categories taxonomy
Step 1: Authorize the API call

Every call to the Computer Vision API requires a subscription key. This key needs to be either passed through a
query string parameter or specified in the request header.


To obtain a subscription key, see How to Obtain Subscription Keys.
1. Passing the subscription key through a query string, see below as a Computer Vision API example:
https://westus.api.cognitive.microsoft.com/vision/v1.0/analyze?visualFeatures=Description,Tags&subscription-key=<Your subscription key>

2. Passing the subscription key can also be specified in the HTTP request header:
ocp-apim-subscription-key: <Your subscription key>

3. When using the client library, the subscription key is passed in through the constructor of VisionServiceClient:
var visionClient = new VisionServiceClient("Your subscriptionKey");

Step 2: Upload an image to the Computer Vision API service and get back tags, descriptions and celebrities

The basic way to perform the Computer Vision API call is by uploading an image directly. This is done by sending a
"POST" request with application/octet-stream content type together with the data read from the image. For "Tags"
and "Description", this upload method will be the same for all the Computer Vision API calls. The only difference will
be the query parameters the user specifies.
Here's how to get "Tags" and "Description" for a given image:
Option One: Get list of "Tags" and one "Description"
POST https://westus.api.cognitive.microsoft.com/vision/v1.0/analyze?visualFeatures=Description,Tags&subscription-key=<Your subscription key>

using Microsoft.ProjectOxford.Vision;
using Microsoft.ProjectOxford.Vision.Contract;
using System.IO;

AnalysisResult analysisResult;
var features = new VisualFeature[] { VisualFeature.Tags, VisualFeature.Description };

using (var fs = new FileStream(@"C:\Vision\Sample.jpg", FileMode.Open))


{
analysisResult = await visionClient.AnalyzeImageAsync(fs, features);
}

Option Two: Get list of "Tags" only, or list of "Description" only:


Tags-only:

POST https://westus.api.cognitive.microsoft.com/vision/v1.0/tag?subscription-key=<Your subscription key>


var analysisResult = await visionClient.GetTagsAsync("http://contoso.com/example.jpg");

Description-only:

POST https://westus.api.cognitive.microsoft.com/vision/v1.0/describe?subscription-key=<Your subscription key>


using (var fs = new FileStream(@"C:\Vision\Sample.jpg", FileMode.Open))
{
analysisResult = await visionClient.DescribeAsync(fs);
}

Here is how to get domain-specific analysis (in our case, for celebrities).
Option One: Scoped Analysis - Analyze only a given model

POST https://westus.api.cognitive.microsoft.com/vision/v1.0/models/celebrities/analyze
var celebritiesResult = await visionClient.AnalyzeImageInDomainAsync(url, "celebrities");

For this option, all other query parameters {visualFeatures, details} are not valid. If you want to see all supported
models, use:

GET https://westus.api.cognitive.microsoft.com/vision/v1.0/models
var models = await visionClient.ListModelsAsync();

Option Two: Enhanced Analysis - Analyze to provide additional details with 86-categories taxonomy
For applications where you want to get generic image analysis in addition to details from one or more domain-
specific models, we extend the v1 API with the details query parameter.

POST https://westus.api.cognitive.microsoft.com/vision/v1.0/analyze?details=celebrities

When this method is invoked, we will call the 86-category classifier first. If any of the categories match that of a
known/matching model, a second pass of classifier invocations will occur. For example, if "details=all", or "details"
includes "celebrities", we will call the celebrities model after the 86-category classifier is called and the result includes
the category person. This will increase latency for users interested in celebrities, compared to Option One.
All v1 query parameters will behave the same in this case. If visualFeatures=categories is not specified, it will be
implicitly enabled.
Step 3: Retrieving and understanding the JSON output for analyze&visualFeatures=Tags, Description

Here's an example:

{
    "tags": [
        {
            "name": "outdoor",
            "score": 0.976
        },
        {
            "name": "bird",
            "score": 0.95
        }
    ],
    "description":
    {
        "tags": [
            "outdoor",
            "bird"
        ],
        "captions": [
            {
                "text": "partridge in a pear tree",
                "confidence": 0.96
            }
        ]
    }
}

FIELD                              TYPE    CONTENT

tags                               object  Top-level object for array of tags

tags[].name                        string  Keyword from the tags classifier

tags[].score                       number  Confidence score, between 0 and 1

description                        object  Top-level object for a description

description.tags[]                 string  List of tags. If there is insufficient confidence in the ability to produce a caption, the tags may be the only information available to the caller.

description.captions[].text        string  A phrase describing the image

description.captions[].confidence  number  Confidence for the phrase

Step 4: Retrieving and understanding the JSON output of domain-specific models

Option One: Scoped Analysis - Analyze only a given model

The output will be an array of tags, as in this example:
{
"result": [
{
"name": "golden retriever",
"score": 0.98
},
{
"name": "Labrador retriever",
"score": 0.78
}
]
}

Option Two: Enhanced Analysis - Analyze to provide additional details with 86-categories taxonomy
For domain-specific models using Option Two (Enhanced Analysis), the categories return type is extended. An
example follows:

{
"requestId": "87e44580-925a-49c8-b661-d1c54d1b83b5",
"metadata": {
"width": 640,
"height": 430,
"format": "Jpeg"
},
"result": {
"celebrities":
[
{
"name": "Richard Nixon",
"faceRectangle": {
"left": 107,
"top": 98,
"width": 165,
"height": 165
},
"confidence": 0.9999827
}
]
    }
}

The categories field is a list of one or more of the 86-categories in the original taxonomy. Note also that categories
ending in an underscore will match that category and its children (for example, people_ as well as people_group, for
celebrities model).

FIELD                TYPE     CONTENT

categories           object   Top-level object

categories[].name    string   Name from the 86-category taxonomy

categories[].score   number   Confidence score, between 0 and 1

categories[].detail  object?  Optional detail object

Note that if multiple categories match (for example, 86-category classifier returns a score for both people_ and
people_young when model=celebrities), the details are attached to the most general level match (people_ in that
example).
Error Responses

These are identical to vision.analyze, with the additional NotSupportedModel error (HTTP 400), which may
be returned in both Option One and Option Two scenarios. For Option Two (Enhanced Analysis), if any of the
models specified in details are not recognized, the API will return a NotSupportedModel error, even if one or more of
them are valid. Users can call listModels to find out what models are supported.
Summary

These are the basic functionalities of the Computer Vision API: how you can upload images and retrieve valuable
metadata in return.
To use the REST API, go to Computer Vision API Reference.
How to Analyze Videos in Real-time
5/9/2017 • 7 min to read

This guide will demonstrate how to perform near-real-time analysis on frames taken from a live video stream. The
basic components in such a system are:
Acquire frames from a video source
Select which frames to analyze
Submit these frames to the API
Consume each analysis result that is returned from the API call
These samples are written in C# and the code can be found on GitHub here:
https://github.com/Microsoft/Cognitive-Samples-VideoFrameAnalysis.

The Approach
There are multiple ways to solve the problem of running near-real-time analysis on video streams. We will start by
outlining three approaches in increasing levels of sophistication.
A Simple Approach
The simplest design for a near-real-time analysis system is an infinite loop, where in each iteration we grab a frame,
analyze it, and then consume the result:

while (true)
{
Frame f = GrabFrame();
if (ShouldAnalyze(f))
{
AnalysisResult r = await Analyze(f);
ConsumeResult(r);
}
}

If our analysis consisted of a lightweight client-side algorithm, this approach would be suitable. However, when our
analysis is happening in the cloud, the latency involved means that an API call might take several seconds, during
which time we are not capturing images, and our thread is essentially doing nothing. Our maximum frame-rate is
limited by the latency of the API calls.
Parallelizing API Calls
While a simple single-threaded loop makes sense for a lightweight client-side algorithm, it doesn't fit well with the
latency involved in cloud API calls. The solution to this problem is to allow the long-running API calls to execute in
parallel with the frame-grabbing. In C#, we could achieve this using Task-based parallelism, for example:
while (true)
{
Frame f = GrabFrame();
if (ShouldAnalyze(f))
{
var t = Task.Run(async () =>
{
AnalysisResult r = await Analyze(f);
ConsumeResult(r);
});
}
}

This launches each analysis in a separate Task, which can run in the background while we continue grabbing new
frames. This avoids blocking the main thread while waiting for an API call to return, however we have lost some of
the guarantees that the simple version provided -- multiple API calls might occur in parallel, and the results might
get returned in the wrong order. This could also cause multiple threads to enter the ConsumeResult() function
simultaneously, which could be dangerous if the function is not thread-safe. Finally, this simple code does not keep
track of the Tasks that get created, so exceptions will silently disappear. Thus, the final ingredient for us to add is a
"consumer" thread that will track the analysis tasks, raise exceptions, kill long-running tasks, and ensure the results
get consumed in the correct order, one at a time.
A Producer-Consumer Design
In our final "producer-consumer" system, we have a producer thread that looks very similar to our previous infinite
loop. However, instead of consuming analysis results as soon as they are available, the producer simply puts the
tasks into a queue to keep track of them.

// Queue that will contain the API call tasks.


var taskQueue = new BlockingCollection<Task<ResultWrapper>>();

// Producer thread.
while (true)
{
// Grab a frame.
Frame f = GrabFrame();

// Decide whether to analyze the frame.


if (ShouldAnalyze(f))
{
// Start a task that will run in parallel with this thread.
var analysisTask = Task.Run(async () =>
{
// Put the frame, and the result/exception into a wrapper object.
var output = new ResultWrapper(f);
try
{
output.Analysis = await Analyze(f);
}
catch (Exception e)
{
output.Exception = e;
}
return output;
});

// Push the task onto the queue.
taskQueue.Add(analysisTask);
}
}

We also have a consumer thread that takes tasks off the queue, waits for them to finish, and either displays the
result or raises the exception that was thrown. By using the queue, we can guarantee that results get consumed
one at a time, in the correct order, without limiting the maximum frame-rate of the system.

// Consumer thread.
while (true)
{
// Get the oldest task.
Task<ResultWrapper> analysisTask = taskQueue.Take();

// Await until the task is completed.


var output = await analysisTask;

// Consume the exception or result.


if (output.Exception != null)
{
throw output.Exception;
}
else
{
ConsumeResult(output.Analysis);
}
}

Implementing the Solution


Getting Started
To get your app up and running as quickly as possible, we have implemented the system described above,
intending it to be flexible enough to implement many scenarios, while being easy to use. To access the code, go to
https://github.com/Microsoft/Cognitive-Samples-VideoFrameAnalysis.
The library contains the class FrameGrabber, which implements the producer-consumer system discussed above to
process video frames from a webcam. The user can specify the exact form of the API call, and the class uses events
to let the calling code know when a new frame is acquired, or a new analysis result is available.
To illustrate some of the possibilities, there are two sample apps that use the library. The first is a simple console
app, and a simplified version of this is reproduced below. It grabs frames from the default webcam, and submits
them to the Face API for face detection.
using System;
using VideoFrameAnalyzer;
using Microsoft.ProjectOxford.Face;
using Microsoft.ProjectOxford.Face.Contract;

namespace VideoFrameConsoleApplication
{
class Program
{
static void Main(string[] args)
{
// Create grabber, with analysis type Face[].
FrameGrabber<Face[]> grabber = new FrameGrabber<Face[]>();

// Create Face API Client. Insert your Face API key here.
FaceServiceClient faceClient = new FaceServiceClient("<subscription key>");

// Set up our Face API call.


grabber.AnalysisFunction = async frame => await faceClient.DetectAsync(frame.Image.ToMemoryStream(".jpg"));

// Set up a listener for when we receive a new result from an API call.
grabber.NewResultAvailable += (s, e) =>
{
if (e.Analysis != null)
Console.WriteLine("New result received for frame acquired at {0}. {1} faces detected", e.Frame.Metadata.Timestamp,
e.Analysis.Length);
};

// Tell grabber to call the Face API every 3 seconds.


grabber.TriggerAnalysisOnInterval(TimeSpan.FromMilliseconds(3000));

// Start running.
grabber.StartProcessingCameraAsync().Wait();

// Wait for keypress to stop


Console.WriteLine("Press any key to stop...");
Console.ReadKey();

// Stop, blocking until done.


grabber.StopProcessingAsync().Wait();
}
}
}

The second sample app is a bit more interesting, and allows you to choose which API to call on the video frames. On
the left-hand side, the app shows a preview of the live video; on the right-hand side, it shows the most recent API
result overlaid on the corresponding frame.
In most modes, there will be a visible delay between the live video on the left, and the visualized analysis on the
right. This delay is the time taken to make the API call. The exception to this is in the
"EmotionsWithClientFaceDetect" mode, which performs face detection locally on the client computer using
OpenCV, before submitting any images to Cognitive Services. By doing this, we can visualize the detected face
immediately, and then update the emotions later once the API call returns. This demonstrates the possibility of a
"hybrid" approach, where some simple processing can be performed on the client, and then Cognitive Services APIs
can be used to augment this with more advanced analysis when necessary.
Integrating into your codebase
To get started with this sample, follow these steps:
1. Get API keys for the Vision APIs from Subscriptions. For video frame analysis, the applicable APIs are:
Computer Vision API
Emotion API
Face API
2. Clone the Cognitive-Samples-VideoFrameAnalysis GitHub repo
3. Open the sample in Visual Studio 2015, build and run the sample applications:
For BasicConsoleSample, the Face API key is hard-coded directly in BasicConsoleSample/Program.cs.
For LiveCameraSample, the keys should be entered into the Settings pane of the app. They will be
persisted across sessions as user data.
When you're ready to integrate, simply reference the VideoFrameAnalyzer library from your own projects.

Developer Code of Conduct


As with all the Cognitive Services, developers using our APIs and samples are required to follow the
"Developer Code of Conduct for Microsoft Cognitive Services."
The image, voice, video, and text understanding capabilities of VideoFrameAnalyzer use Microsoft Cognitive Services.
Microsoft will receive the images, audio, video, and other data that you upload (via this app) and may use them for
service improvement purposes. We ask for your help in protecting the people whose data your app sends to
Microsoft Cognitive Services.

Summary
In this guide, you learned how to run near-real-time analysis on live video streams using the Face, Computer Vision,
and Emotion APIs, and how you can use our sample code to get started. You can get started building your app with
free API keys at the Microsoft Cognitive Services sign-up page.
Please feel free to provide feedback and suggestions in the GitHub repository, or for more broad API feedback, on
our UserVoice site.
Computer Vision cURL Quick Starts
5/25/2017 • 3 min to read

This article provides information and code samples to help you quickly get started using the Computer Vision API
with cURL to accomplish the following tasks:
Analyze an image
Intelligently generate a thumbnail
Detect and extract text from an Image
Learn more about obtaining free subscription keys here.

Analyze an Image With Computer Vision API Using cURL


With the Analyze Image method, you can extract visual features based on image content. You can upload an image
or specify an image URL and choose which features to return, including:
The category defined in this taxonomy.
A detailed list of tags related to the image content.
A description of image content in a complete sentence.
The coordinates, gender, and age of any faces contained in the image.
The ImageType (clip art or a line drawing).
The dominant color, the accent color, or whether an image is black & white.
Whether the image contains pornographic or sexually suggestive content.
Analyze an Image curl Example Request
Change the URL to use the location where you obtained your subscription keys, and replace the "Ocp-Apim-
Subscription-Key" value with your valid subscription key.

NOTE
You must use the same location in your REST call as you used to obtain your subscription keys. For example, if you obtained
your subscription keys from westus, replace "westcentralus" in the URL below with "westus".

@ECHO OFF

curl -v -X POST "https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/analyze?visualFeatures=Categories&details={string}&language=en" ^
-H "Content-Type: application/json" ^
-H "Ocp-Apim-Subscription-Key: {subscription key}" ^
--data-ascii "{body}"

Analyze an Image Response


A successful response is returned in JSON. Following is an example of a successful response:

{
"categories": [
{
"name": "abstract_",
"score": 0.00390625
"score": 0.00390625
},
{
"name": "people_",
"score": 0.83984375,
"detail": {
"celebrities": [
{
"name": "Satya Nadella",
"faceRectangle": {
"left": 597,
"top": 162,
"width": 248,
"height": 248
},
"confidence": 0.999028444
}
]
}
}
],
"adult": {
"isAdultContent": false,
"isRacyContent": false,
"adultScore": 0.0934349000453949,
"racyScore": 0.068613491952419281
},
"tags": [
{
"name": "person",
"confidence": 0.98979085683822632
},
{
"name": "man",
"confidence": 0.94493889808654785
},
{
"name": "outdoor",
"confidence": 0.938492476940155
},
{
"name": "window",
"confidence": 0.89513939619064331
}
],
"description": {
"tags": [
"person",
"man",
"outdoor",
"window",
"glasses"
],
"captions": [
{
"text": "Satya Nadella sitting on a bench",
"confidence": 0.48293603002174407
}
]
},
"requestId": "0dbec5ad-a3d3-4f7e-96b4-dfd57efe967d",
"metadata": {
"width": 1500,
"height": 1000,
"format": "Jpeg"
},
"faces": [
{
"age": 44,
"gender": "Male",
"gender": "Male",
"faceRectangle": {
"left": 593,
"top": 160,
"width": 250,
"height": 250
}
}
],
"color": {
"dominantColorForeground": "Brown",
"dominantColorBackground": "Brown",
"dominantColors": [
"Brown",
"Black"
],
"accentColor": "873B59",
"isBWImg": false
},
"imageType": {
"clipArtType": 0,
"lineDrawingType": 0
}
}

Get a Thumbnail with Computer Vision API Using curl


Use the Get Thumbnail method to crop an image based on its region of interest (ROI) to the height and width you
desire, even if the aspect ratio differs from the input image.
Get a Thumbnail curl Example Request
Change the URL to use the location where you obtained your subscription keys, and replace the "Ocp-Apim-
Subscription-Key" value with your valid subscription key.

NOTE
You must use the same location in your REST call as you used to obtain your subscription keys. For example, if you obtained
your subscription keys from westus, replace "westcentralus" in the URL below with "westus".

@ECHO OFF

curl -v -X POST "https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/generateThumbnail?width={number}&height={number}&smartCropping=true" ^
-H "Content-Type: application/json" ^
-H "Ocp-Apim-Subscription-Key: {subscription key}" ^
--data-ascii "{body}"

Get a Thumbnail Response


A successful response contains the thumbnail image binary. If the request failed, the response contains an error
code and a message to help determine what went wrong.

Optical Character Recognition (OCR) with Computer Vision API Using


curl
Use the Optical Character Recognition (OCR) method to detect text in an image and extract recognized characters
into a machine-usable character stream.
OCR curl Example Request
Change the URL to use the location where you obtained your subscription keys, and replace the "Ocp-Apim-
Subscription-Key" value with your valid subscription key.

NOTE
You must use the same location in your REST call as you used to obtain your subscription keys. For example, if you obtained
your subscription keys from westus, replace "westcentralus" in the URL below with "westus".

@ECHO OFF

curl -v -X POST "https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/ocr?language=unk&detectOrientation=true" ^
-H "Content-Type: application/json" ^
-H "Ocp-Apim-Subscription-Key: {subscription key}" ^
--data-ascii "{body}"

OCR Example Response


Upon success, the OCR results returned include text, bounding box for regions, lines, and words.
{
"language": "en",
"textAngle": -2.0000000000000338,
"orientation": "Up",
"regions": [
{
"boundingBox": "462,379,497,258",
"lines": [
{
"boundingBox": "462,379,497,74",
"words": [
{
"boundingBox": "462,379,41,73",
"text": "A"
},
{
"boundingBox": "523,379,153,73",
"text": "GOAL"
},
{
"boundingBox": "694,379,265,74",
"text": "WITHOUT"
}
]
},
{
"boundingBox": "565,471,289,74",
"words": [
{
"boundingBox": "565,471,41,73",
"text": "A"
},
{
"boundingBox": "626,471,150,73",
"text": "PLAN"
},
{
"boundingBox": "801,472,53,73",
"text": "IS"
}
]
},
{
"boundingBox": "519,563,375,74",
"words": [
{
"boundingBox": "519,563,149,74",
"text": "JUST"
},
{
"boundingBox": "683,564,41,72",
"text": "A"
},
{
"boundingBox": "741,564,153,73",
"text": "WISH"
}
]
}
]
}
]
}
Computer Vision C# Quick Starts
6/13/2017 • 20 min to read

This article provides information and code samples to help you quickly get started using the Computer Vision API
with C# to accomplish the following tasks:
Analyze an image
Use a Domain-Specific Model
Intelligently generate a thumbnail
Detect and extract printed text from an image
Detect and extract handwritten text from an image

Prerequisites
Get the Microsoft Computer Vision API Windows SDK here.
To use the Computer Vision API, you need a subscription key. You can get free subscription keys here.

Analyze an Image With Computer Vision API using C#


With the Analyze Image method, you can extract visual features based on image content. You can upload an image
or specify an image URL and choose which features to return, including:
A detailed list of tags related to the image content.
A description of image content in a complete sentence.
The coordinates, gender, and age of any faces contained in the image.
The ImageType (clip art or a line drawing).
The dominant color, the accent color, or whether an image is black & white.
The category defined in this taxonomy.
Whether the image contains adult or sexually suggestive content.
Analyze an image C# example request
Create a new Console solution in Visual Studio, then replace Program.cs with the following code. Change the
uriBase to use the location where you obtained your subscription keys, and replace the subscriptionKey value with
your valid subscription key.

using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;

namespace CSHttpClientSample
{
static class Program
{
// **********************************************
// *** Update or verify the following values. ***
// **********************************************

// Replace the subscriptionKey string value with your valid subscription key.
const string subscriptionKey = "13hc77781f7e4b19b5fcdd72a8df7156";

// Replace or verify the region.


//
// You must use the same region in your REST API call as you used to obtain your subscription keys.
// For example, if you obtained your subscription keys from the westus region, replace
// "westcentralus" in the URI below with "westus".
//
// NOTE: Free trial subscription keys are generated in the westcentralus region, so if you are using
// a free trial subscription key, you should not need to change this region.
const string uriBase = "https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/analyze";

static void Main()


{
// Get the path and filename to process from the user.
Console.WriteLine("Analyze an image:");
Console.Write("Enter the path to an image you wish to analzye: ");
string imageFilePath = Console.ReadLine();

// Execute the REST API call.


MakeAnalysisRequest(imageFilePath);

Console.WriteLine("\nPlease wait a moment for the results to appear. Then, press Enter to exit...\n");
Console.ReadLine();
}

/// <summary>
/// Gets the analysis of the specified image file by using the Computer Vision REST API.
/// </summary>
/// <param name="imageFilePath">The image file.</param>
static async void MakeAnalysisRequest(string imageFilePath)
{
HttpClient client = new HttpClient();

// Request headers.
client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", subscriptionKey);

// Request parameters. A third optional parameter is "details".


string requestParameters = "visualFeatures=Categories,Description,Color&language=en";

// Assemble the URI for the REST API Call.


string uri = uriBase + "?" + requestParameters;

HttpResponseMessage response;

// Request body. Posts a locally stored JPEG image.


byte[] byteData = GetImageAsByteArray(imageFilePath);

using (ByteArrayContent content = new ByteArrayContent(byteData))


{
// This example uses content type "application/octet-stream".
// The other content types you can use are "application/json" and "multipart/form-data".
content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");

// Execute the REST API call.


response = await client.PostAsync(uri, content);

// Get the JSON response.


string contentString = await response.Content.ReadAsStringAsync();

// Display the JSON response.


Console.WriteLine("\nResponse:\n");
Console.WriteLine(JsonPrettyPrint(contentString));
}
}

/// <summary>
/// Returns the contents of the specified file as a byte array.
/// </summary>
/// <param name="imageFilePath">The image file to read.</param>
/// <returns>The byte array of the image data.</returns>
static byte[] GetImageAsByteArray(string imageFilePath)
{
FileStream fileStream = new FileStream(imageFilePath, FileMode.Open, FileAccess.Read);
BinaryReader binaryReader = new BinaryReader(fileStream);
return binaryReader.ReadBytes((int)fileStream.Length);
}

/// <summary>
/// Formats the given JSON string by adding line breaks and indents.
/// </summary>
/// <param name="json">The raw JSON string to format.</param>
/// <returns>The formatted JSON string.</returns>
static string JsonPrettyPrint(string json)
{
if (string.IsNullOrEmpty(json))
return string.Empty;

json = json.Replace(Environment.NewLine, "").Replace("\t", "");

StringBuilder sb = new StringBuilder();


bool quote = false;
bool ignore = false;
int offset = 0;
int indentLength = 3;

foreach(char ch in json)
{
switch (ch)
{
case '"':
if (!ignore) quote = !quote;
break;
case '\'':
if (quote) ignore = !ignore;
break;
}

if (quote)
sb.Append(ch);
else
{
switch (ch)
{
case '{':
case '[':
sb.Append(ch);
sb.Append(Environment.NewLine);
sb.Append(new string(' ', ++offset * indentLength));
break;
case '}':
case ']':
sb.Append(Environment.NewLine);
sb.Append(new string(' ', --offset * indentLength));
sb.Append(ch);
break;
case ',':
sb.Append(ch);
sb.Append(Environment.NewLine);
sb.Append(new string(' ', offset * indentLength));
break;
case ':':
sb.Append(ch);
sb.Append(' ');
break;
default:
if (ch != ' ') sb.Append(ch);
break;
}
}
}

return sb.ToString().Trim();
}
}
}

Analyze an Image response


A successful response is returned in JSON. Following is an example of a successful response:
{
"categories": [
{
"name": "abstract_",
"score": 0.00390625
},
{
"name": "others_",
"score": 0.0234375
},
{
"name": "outdoor_",
"score": 0.00390625
}
],
"description": {
"tags": [
"road",
"building",
"outdoor",
"street",
"night",
"black",
"city",
"white",
"light",
"sitting",
"riding",
"man",
"side",
"empty",
"rain",
"corner",
"traffic",
"lit",
"hydrant",
"stop",
"board",
"parked",
"bus",
"tall"
],
"captions": [
{
"text": "a close up of an empty city street at night",
"confidence": 0.7965622853462756
}
]
},
"requestId": "dddf1ac9-7e66-4c47-bdef-222f3fe5aa23",
"metadata": {
"width": 3733,
"height": 1986,
"format": "Jpeg"
},
"color": {
"dominantColorForeground": "Black",
"dominantColorBackground": "Black",
"dominantColors": [
"Black",
"Grey"
],
"accentColor": "666666",
"isBWImg": true
}
}
Use a Domain-Specific Model
The Domain-Specific Model is a model trained to identify a specific set of objects in an image. The two domain-
specific models that are currently available are celebrities and landmarks. The following example identifies a
landmark in an image.
Landmark C# example request
Create a new Console solution in Visual Studio, then replace Program.cs with the following code. Change the
uriBase to use the location where you obtained your subscription keys, and replace the subscriptionKey value with
your valid subscription key.

using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;

namespace CSHttpClientSample
{
static class Program
{
// **********************************************
// *** Update or verify the following values. ***
// **********************************************

// Replace the subscriptionKey string value with your valid subscription key.
const string subscriptionKey = "13hc77781f7e4b19b5fcdd72a8df7156";

// Replace or verify the region.


//
// You must use the same region in your REST API call as you used to obtain your subscription keys.
// For example, if you obtained your subscription keys from the westus region, replace
// "westcentralus" in the URI below with "westus".
//
// NOTE: Free trial subscription keys are generated in the westcentralus region, so if you are using
// a free trial subscription key, you should not need to change this region.
//
// Also, if you want to use the celebrities model, change "landmarks" to "celebrities" here and in
// requestParameters to use the Celebrities model.
const string uriBase = "https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/models/landmarks/analyze";

static void Main()


{
// Get the path and filename to process from the user.
Console.WriteLine("Domain-Specific Model:");
Console.Write("Enter the path to an image you wish to analzye for landmarks: ");
string imageFilePath = Console.ReadLine();

// Execute the REST API call.


MakeAnalysisRequest(imageFilePath);

Console.WriteLine("\nPlease wait a moment for the results to appear. Then, press Enter to exit ...\n");
Console.ReadLine();
}

/// <summary>
/// Gets the domain-specific (landmarks) analysis of the specified image file by using the Computer Vision REST API.
/// </summary>
/// <param name="imageFilePath">The image file to analyze.</param>
static async void MakeAnalysisRequest(string imageFilePath)
{
HttpClient client = new HttpClient();

// Request headers.
client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", subscriptionKey);

// Request parameters. Change "landmarks" to "celebrities" here and in uriBase to use the Celebrities model.
string requestParameters = "model=landmarks";

// Assemble the URI for the REST API Call.


string uri = uriBase + "?" + requestParameters;

HttpResponseMessage response;

// Request body. Posts a locally stored JPEG image.


byte[] byteData = GetImageAsByteArray(imageFilePath);

using (ByteArrayContent content = new ByteArrayContent(byteData))


{
// This example uses content type "application/octet-stream".
// The other content types you can use are "application/json" and "multipart/form-data".
content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");

// Execute the REST API call.


response = await client.PostAsync(uri, content);

// Get the JSON response.


string contentString = await response.Content.ReadAsStringAsync();

// Display the JSON response.


Console.WriteLine("\nResponse:\n");
Console.WriteLine(JsonPrettyPrint(contentString));
}
}

/// <summary>
/// Returns the contents of the specified file as a byte array.
/// </summary>
/// <param name="imageFilePath">The image file to read.</param>
/// <returns>The byte array of the image data.</returns>
static byte[] GetImageAsByteArray(string imageFilePath)
{
FileStream fileStream = new FileStream(imageFilePath, FileMode.Open, FileAccess.Read);
BinaryReader binaryReader = new BinaryReader(fileStream);
return binaryReader.ReadBytes((int)fileStream.Length);
}

/// <summary>
/// Formats the given JSON string by adding line breaks and indents.
/// </summary>
/// <param name="json">The raw JSON string to format.</param>
/// <returns>The formatted JSON string.</returns>
static string JsonPrettyPrint(string json)
{
if (string.IsNullOrEmpty(json))
return string.Empty;

json = json.Replace(Environment.NewLine, "").Replace("\t", "");

StringBuilder sb = new StringBuilder();


bool quote = false;
bool ignore = false;
int offset = 0;
int indentLength = 3;

foreach (char ch in json)


{
switch (ch)
{
case '"':
if (!ignore) quote = !quote;
break;
case '\'':
if (quote) ignore = !ignore;
break;
}

if (quote)
sb.Append(ch);
else
{
switch (ch)
{
case '{':
case '[':
sb.Append(ch);
sb.Append(Environment.NewLine);
sb.Append(new string(' ', ++offset * indentLength));
break;
case '}':
case ']':
sb.Append(Environment.NewLine);
sb.Append(new string(' ', --offset * indentLength));
sb.Append(ch);
break;
case ',':
sb.Append(ch);
sb.Append(Environment.NewLine);
sb.Append(new string(' ', offset * indentLength));
break;
case ':':
sb.Append(ch);
sb.Append(' ');
break;
default:
if (ch != ' ') sb.Append(ch);
break;
}
}
}

return sb.ToString().Trim();
}
}
}

Landmark example response


A successful response is returned in JSON. Following is an example of a successful response:

{
"requestId": "cfe3d4eb-4d9c-4dda-ae63-7d3a27ce6d27",
"metadata": {
"width": 1024,
"height": 680,
"format": "Jpeg"
},
"result": {
"landmarks": [
{
"name": "Space Needle",
"confidence": 0.9448209
}
]
}
}
Get a thumbnail with Computer Vision API using C#
Use the Get Thumbnail method to crop an image based on its region of interest (ROI) to the height and width you
desire. You can even pick an aspect ratio that differs from the aspect ratio of the input image.
Get a thumbnail C# example request
Create a new Console solution in Visual Studio, then replace Program.cs with the following code. Change the
uriBase to use the location where you obtained your subscription keys, and replace the subscriptionKey value with
your valid subscription key.

using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;

namespace CSHttpClientSample
{
static class Program
{
// **********************************************
// *** Update or verify the following values. ***
// **********************************************

// Replace the subscriptionKey string value with your valid subscription key.
const string subscriptionKey = "13hc77781f7e4b19b5fcdd72a8df7156";

// Replace or verify the region.


//
// You must use the same region in your REST API call as you used to obtain your subscription keys.
// For example, if you obtained your subscription keys from the westus region, replace
// "westcentralus" in the URI below with "westus".
//
// NOTE: Free trial subscription keys are generated in the westcentralus region, so if you are using
// a free trial subscription key, you should not need to change this region.
const string uriBase = "https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/generateThumbnail";

static void Main()


{
// Get the path and filename to process from the user.
Console.WriteLine("Thumbnail:");
Console.Write("Enter the path to an image you wish to use to create a thumbnail image: ");
string imageFilePath = Console.ReadLine();

// Execute the REST API call.


MakeThumbNailRequest(imageFilePath);

Console.WriteLine("\nPlease wait a moment for the results to appear. Then, press Enter to exit ...\n");
Console.ReadLine();
}

/// <summary>
/// Gets a thumbnail image from the specified image file by using the Computer Vision REST API.
/// </summary>
/// <param name="imageFilePath">The image file to use to create the thumbnail image.</param>
static async void MakeThumbNailRequest(string imageFilePath)
{
HttpClient client = new HttpClient();

// Request headers.
client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", subscriptionKey);

// Request parameters.
string requestParameters = "width=200&height=150&smartCropping=true";

// Assemble the URI for the REST API Call.


string uri = uriBase + "?" + requestParameters;

HttpResponseMessage response;

// Request body. Posts a locally stored JPEG image.


byte[] byteData = GetImageAsByteArray(imageFilePath);

using (ByteArrayContent content = new ByteArrayContent(byteData))


{
// This example uses content type "application/octet-stream".
// The other content types you can use are "application/json" and "multipart/form-data".
content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");

// Execute the REST API call.


response = await client.PostAsync(uri, content);

if (response.IsSuccessStatusCode)
{
// Display the response data.
Console.WriteLine("\nResponse:\n");
Console.WriteLine(response);

// Get the image data.


byte[] thumbnailImageData = await response.Content.ReadAsByteArrayAsync();
}
else
{
// Display the JSON error data.
Console.WriteLine("\nError:\n");
Console.WriteLine(JsonPrettyPrint(await response.Content.ReadAsStringAsync()));
}
}
}

/// <summary>
/// Returns the contents of the specified file as a byte array.
/// </summary>
/// <param name="imageFilePath">The image file to read.</param>
/// <returns>The byte array of the image data.</returns>
static byte[] GetImageAsByteArray(string imageFilePath)
{
FileStream fileStream = new FileStream(imageFilePath, FileMode.Open, FileAccess.Read);
BinaryReader binaryReader = new BinaryReader(fileStream);
return binaryReader.ReadBytes((int)fileStream.Length);
}

/// <summary>
/// Formats the given JSON string by adding line breaks and indents.
/// </summary>
/// <param name="json">The raw JSON string to format.</param>
/// <returns>The formatted JSON string.</returns>
static string JsonPrettyPrint(string json)
{
if (string.IsNullOrEmpty(json))
return string.Empty;

json = json.Replace(Environment.NewLine, "").Replace("\t", "");

StringBuilder sb = new StringBuilder();

bool quote = false;
bool ignore = false;
int offset = 0;
int indentLength = 3;

foreach (char ch in json)
{
switch (ch)
{
case '"':
if (!ignore) quote = !quote;
break;
case '\'':
if (quote) ignore = !ignore;
break;
}

if (quote)
sb.Append(ch);
else
{
switch (ch)
{
case '{':
case '[':
sb.Append(ch);
sb.Append(Environment.NewLine);
sb.Append(new string(' ', ++offset * indentLength));
break;
case '}':
case ']':
sb.Append(Environment.NewLine);
sb.Append(new string(' ', --offset * indentLength));
sb.Append(ch);
break;
case ',':
sb.Append(ch);
sb.Append(Environment.NewLine);
sb.Append(new string(' ', offset * indentLength));
break;
case ':':
sb.Append(ch);
sb.Append(' ');
break;
default:
if (ch != ' ') sb.Append(ch);
break;
}
}
}

return sb.ToString().Trim();
}
}
}

Get a Thumbnail response


A successful response contains the thumbnail image binary. If the request fails, the response contains an error code
and a message to help determine what went wrong.
Response:

StatusCode: 200, ReasonPhrase: 'OK', Version: 1.1, Content: System.Net.Http.StreamContent, Headers:
{
Pragma: no-cache
apim-request-id: 131eb5b4-5807-466d-9656-4c1ef0a64c9b
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
x-content-type-options: nosniff
Cache-Control: no-cache
Date: Tue, 06 Jun 2017 20:54:07 GMT
X-AspNet-Version: 4.0.30319
X-Powered-By: ASP.NET
Content-Length: 5800
Content-Type: image/jpeg
Expires: -1
}
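
The sample above reads the thumbnail into the thumbnailImageData byte array but never uses it. If you want to persist the result, the following minimal sketch (not part of the original sample) can be added inside the if (response.IsSuccessStatusCode) branch; the output file name is an arbitrary example:

// Save the returned thumbnail to disk. System.IO is already imported by the sample.
// "thumbnail.jpg" is an example path; substitute any writable location.
File.WriteAllBytes("thumbnail.jpg", thumbnailImageData);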

Optical Character Recognition (OCR) with Computer Vision API using C#
Use the Optical Character Recognition (OCR) method to detect printed text in an image and extract recognized
characters into a machine-usable character stream.
OCR C# example request
Create a new Console solution in Visual Studio, then replace Program.cs with the following code. Change the
uriBase to use the location where you obtained your subscription keys, and replace the subscriptionKey value with
your valid subscription key.

using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;

namespace CSHttpClientSample
{
static class Program
{
// **********************************************
// *** Update or verify the following values. ***
// **********************************************

// Replace the subscriptionKey string value with your valid subscription key.
const string subscriptionKey = "13hc77781f7e4b19b5fcdd72a8df7156";

// Replace or verify the region.
//
// You must use the same region in your REST API call as you used to obtain your subscription keys.
// For example, if you obtained your subscription keys from the westus region, replace
// "westcentralus" in the URI below with "westus".
//
// NOTE: Free trial subscription keys are generated in the westcentralus region, so if you are using
// a free trial subscription key, you should not need to change this region.
const string uriBase = "https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/ocr";

static void Main()
{
// Get the path and filename to process from the user.
Console.WriteLine("Optical Character Recognition:");
Console.Write("Enter the path to an image with text you wish to read: ");
string imageFilePath = Console.ReadLine();
// Execute the REST API call.
MakeOCRRequest(imageFilePath);

Console.WriteLine("\nPlease wait a moment for the results to appear. Then, press Enter to exit...\n");
Console.ReadLine();
}

/// <summary>
/// Gets the text visible in the specified image file by using the Computer Vision REST API.
/// </summary>
/// <param name="imageFilePath">The image file.</param>
static async void MakeOCRRequest(string imageFilePath)
{
HttpClient client = new HttpClient();

// Request headers.
client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", subscriptionKey);

// Request parameters.
string requestParameters = "language=unk&detectOrientation=true";

// Assemble the URI for the REST API Call.
string uri = uriBase + "?" + requestParameters;

HttpResponseMessage response;

// Request body. Posts a locally stored JPEG image.
byte[] byteData = GetImageAsByteArray(imageFilePath);

using (ByteArrayContent content = new ByteArrayContent(byteData))
{
// This example uses content type "application/octet-stream".
// The other content types you can use are "application/json" and "multipart/form-data".
content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");

// Execute the REST API call.
response = await client.PostAsync(uri, content);

// Get the JSON response.
string contentString = await response.Content.ReadAsStringAsync();

// Display the JSON response.
Console.WriteLine("\nResponse:\n");
Console.WriteLine(JsonPrettyPrint(contentString));
}
}

/// <summary>
/// Returns the contents of the specified file as a byte array.
/// </summary>
/// <param name="imageFilePath">The image file to read.</param>
/// <returns>The byte array of the image data.</returns>
static byte[] GetImageAsByteArray(string imageFilePath)
{
FileStream fileStream = new FileStream(imageFilePath, FileMode.Open, FileAccess.Read);
BinaryReader binaryReader = new BinaryReader(fileStream);
return binaryReader.ReadBytes((int)fileStream.Length);
}

/// <summary>
/// Formats the given JSON string by adding line breaks and indents.
/// </summary>
/// <param name="json">The raw JSON string to format.</param>
/// <returns>The formatted JSON string.</returns>
static string JsonPrettyPrint(string json)
{
if (string.IsNullOrEmpty(json))
return string.Empty;

json = json.Replace(Environment.NewLine, "").Replace("\t", "");

StringBuilder sb = new StringBuilder();

bool quote = false;
bool ignore = false;
int offset = 0;
int indentLength = 3;

foreach (char ch in json)
{
switch (ch)
{
case '"':
if (!ignore) quote = !quote;
break;
case '\'':
if (quote) ignore = !ignore;
break;
}

if (quote)
sb.Append(ch);
else
{
switch (ch)
{
case '{':
case '[':
sb.Append(ch);
sb.Append(Environment.NewLine);
sb.Append(new string(' ', ++offset * indentLength));
break;
case '}':
case ']':
sb.Append(Environment.NewLine);
sb.Append(new string(' ', --offset * indentLength));
sb.Append(ch);
break;
case ',':
sb.Append(ch);
sb.Append(Environment.NewLine);
sb.Append(new string(' ', offset * indentLength));
break;
case ':':
sb.Append(ch);
sb.Append(' ');
break;
default:
if (ch != ' ') sb.Append(ch);
break;
}
}
}

return sb.ToString().Trim();
}
}
}

OCR Example Response


Upon success, the OCR results include the detected text and bounding boxes for regions, lines, and words.

{
"language": "en",
"textAngle": -1.5000000000000335,
"orientation": "Up",
"regions": [
{
"boundingBox": "154,49,351,575",
"lines": [
{
"boundingBox": "165,49,340,117",
"words": [
{
"boundingBox": "165,49,63,109",
"text": "A"
},
{
"boundingBox": "261,50,244,116",
"text": "GOAL"
}
]
},
{
"boundingBox": "165,169,339,93",
"words": [
{
"boundingBox": "165,169,339,93",
"text": "WITHOUT"
}
]
},
{
"boundingBox": "159,264,342,117",
"words": [
{
"boundingBox": "159,264,64,110",
"text": "A"
},
{
"boundingBox": "255,266,246,115",
"text": "PLAN"
}
]
},
{
"boundingBox": "161,384,338,119",
"words": [
{
"boundingBox": "161,384,86,113",
"text": "IS"
},
{
"boundingBox": "274,387,225,116",
"text": "JUST"
}
]
},
{
"boundingBox": "154,506,341,118",
"words": [
{
"boundingBox": "154,506,62,111",
"text": "A"
},
{
"boundingBox": "248,508,247,116",
"text": "WISH"
}
]
}
]
}
]
}
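
To turn this structure into plain text, walk the regions, lines, and words arrays and join the word fragments. The following helper is a minimal sketch, not part of the original sample; it assumes the Newtonsoft.Json NuGet package, which the sample itself does not use:

using System.Linq;
using System.Text;
using Newtonsoft.Json.Linq;

// Flattens an OCR response into one string per recognized line.
static string ExtractText(string ocrJson)
{
    JObject result = JObject.Parse(ocrJson);
    StringBuilder sb = new StringBuilder();

    // Each region is a block of text; each line is made up of individual words.
    foreach (JToken region in result["regions"])
        foreach (JToken line in region["lines"])
            sb.AppendLine(string.Join(" ", line["words"].Select(w => (string)w["text"])));

    return sb.ToString();
}

For the response above, ExtractText produces "A GOAL", "WITHOUT", "A PLAN", "IS JUST", and "A WISH", one per recognized line.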

Text recognition with Computer Vision API using C#


Use the RecognizeText method to detect handwritten or printed text in an image and extract recognized characters
into a machine-usable character stream.
Handwriting recognition C# example
Create a new Console solution in Visual Studio, then replace Program.cs with the following code. Change the
uriBase to use the location where you obtained your subscription keys, and replace the subscriptionKey value with
your valid subscription key.

using System;
using System.IO;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;

namespace CSHttpClientSample
{
static class Program
{
// **********************************************
// *** Update or verify the following values. ***
// **********************************************

// Replace the subscriptionKey string value with your valid subscription key.
const string subscriptionKey = "13hc77781f7e4b19b5fcdd72a8df7156";

// Replace or verify the region.
//
// You must use the same region in your REST API call as you used to obtain your subscription keys.
// For example, if you obtained your subscription keys from the westus region, replace
// "westcentralus" in the URI below with "westus".
//
// NOTE: Free trial subscription keys are generated in the westcentralus region, so if you are using
// a free trial subscription key, you should not need to change this region.
const string uriBase = "https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/recognizeText";

static void Main()
{
// Get the path and filename to process from the user.
Console.WriteLine("Handwriting Recognition:");
Console.Write("Enter the path to an image with handwritten text you wish to read: ");
string imageFilePath = Console.ReadLine();

// Execute the REST API call.
ReadHandwrittenText(imageFilePath);

Console.WriteLine("\nPlease wait a moment for the results to appear. Then, press Enter to exit...\n");
Console.ReadLine();
}

/// <summary>
/// Gets the handwritten text from the specified image file by using the Computer Vision REST API.
/// </summary>
/// <param name="imageFilePath">The image file with handwritten text.</param>
static async void ReadHandwrittenText(string imageFilePath)
{
HttpClient client = new HttpClient();

// Request headers.
client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", subscriptionKey);

// Request parameter. Set "handwriting" to false for printed text.
string requestParameters = "handwriting=true";

// Assemble the URI for the REST API Call.
string uri = uriBase + "?" + requestParameters;

HttpResponseMessage response = null;

// This operation requires two REST API calls. One to submit the image for processing,
// the other to retrieve the text found in the image. This value stores the REST API
// location to call to retrieve the text.
string operationLocation = null;

// Request body. Posts a locally stored JPEG image.
byte[] byteData = GetImageAsByteArray(imageFilePath);
ByteArrayContent content = new ByteArrayContent(byteData);

// This example uses content type "application/octet-stream".
// You can also use "application/json" and specify an image URL.
content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");

// The first REST call starts the async process to analyze the written text in the image.
response = await client.PostAsync(uri, content);

// The response contains the URI to retrieve the result of the process.
if (response.IsSuccessStatusCode)
operationLocation = response.Headers.GetValues("Operation-Location").FirstOrDefault();
else
{
// Display the JSON error data.
Console.WriteLine("\nError:\n");
Console.WriteLine(JsonPrettyPrint(await response.Content.ReadAsStringAsync()));
return;
}

// The second REST call retrieves the text written in the image.
//
// Note: The response may not be immediately available. Handwriting recognition is an
// async operation that can take a variable amount of time depending on the length
// of the handwritten text. You may need to wait or retry this operation.
//
// This example checks once per second for ten seconds.
string contentString;
int i = 0;
do
{
System.Threading.Thread.Sleep(1000);
response = await client.GetAsync(operationLocation);
contentString = await response.Content.ReadAsStringAsync();
++i;
}
while (i < 10 && contentString.IndexOf("\"status\":\"Succeeded\"") == -1);

if (i == 10 && contentString.IndexOf("\"status\":\"Succeeded\"") == -1)
{
Console.WriteLine("\nTimeout error.\n");
return;
}

// Display the JSON response.
Console.WriteLine("\nResponse:\n");
Console.WriteLine(JsonPrettyPrint(contentString));
}
/// <summary>
/// Returns the contents of the specified file as a byte array.
/// </summary>
/// <param name="imageFilePath">The image file to read.</param>
/// <returns>The byte array of the image data.</returns>
static byte[] GetImageAsByteArray(string imageFilePath)
{
FileStream fileStream = new FileStream(imageFilePath, FileMode.Open, FileAccess.Read);
BinaryReader binaryReader = new BinaryReader(fileStream);
return binaryReader.ReadBytes((int)fileStream.Length);
}

/// <summary>
/// Formats the given JSON string by adding line breaks and indents.
/// </summary>
/// <param name="json">The raw JSON string to format.</param>
/// <returns>The formatted JSON string.</returns>
static string JsonPrettyPrint(string json)
{
if (string.IsNullOrEmpty(json))
return string.Empty;

json = json.Replace(Environment.NewLine, "").Replace("\t", "");

StringBuilder sb = new StringBuilder();

bool quote = false;
bool ignore = false;
int offset = 0;
int indentLength = 3;

foreach (char ch in json)
{
switch (ch)
{
case '"':
if (!ignore) quote = !quote;
break;
case '\'':
if (quote) ignore = !ignore;
break;
}

if (quote)
sb.Append(ch);
else
{
switch (ch)
{
case '{':
case '[':
sb.Append(ch);
sb.Append(Environment.NewLine);
sb.Append(new string(' ', ++offset * indentLength));
break;
case '}':
case ']':
sb.Append(Environment.NewLine);
sb.Append(new string(' ', --offset * indentLength));
sb.Append(ch);
break;
case ',':
sb.Append(ch);
sb.Append(Environment.NewLine);
sb.Append(new string(' ', offset * indentLength));
break;
case ':':
sb.Append(ch);
sb.Append(' ');
break;
default:
if (ch != ' ') sb.Append(ch);
break;
}
}
}

return sb.ToString().Trim();
}
}
}

Handwriting recognition response


A successful response is returned in JSON. Following is an example of a successful response:

{
"status": "Succeeded",
"recognitionResult": {
"lines": [
{
"boundingBox": [
99,
195,
1309,
45,
1340,
292,
130,
442
],
"text": "when you write them down",
"words": [
{
"boundingBox": [
152,
191,
383,
154,
341,
421,
110,
458
],
"text": "when"
},
{
"boundingBox": [
436,
145,
607,
118,
565,
385,
394,
412
],
"text": "you"
},
{
"boundingBox": [
644,
112,
873,
76,
831,
343,
602,
379
],
"text": "write"
},
{
"boundingBox": [
895,
72,
1092,
41,
1050,
308,
853,
339
],
"text": "them"
},
{
"boundingBox": [
1140,
33,
1400,
0,
1359,
258,
1098,
300
],
"text": "down"
}
]
},
{
"boundingBox": [
142,
222,
1252,
62,
1269,
180,
159,
340
],
"text": "You remember things better",
"words": [
{
"boundingBox": [
140,
223,
267,
205,
288,
324,
162,
342
],
"text": "You"
},
{
"boundingBox": [
314,
198,
740,
137,
761,
256,
335,
317
],
"text": "remember"
},
{
"boundingBox": [
761,
134,
1026,
95,
1047,
215,
782,
253
],
"text": "things"
},
{
"boundingBox": [
1046,
92,
1285,
58,
1307,
177,
1068,
212
],
"text": "better"
}
]
},
{
"boundingBox": [
155,
405,
537,
338,
557,
449,
175,
516
],
"text": "by hand",
"words": [
{
"boundingBox": [
146,
408,
266,
387,
301,
495,
181,
516
],
"text": "by"
},
{
"boundingBox": [
290,
383,
569,
334,
604,
443,
325,
491
],
"text": "hand"
}
]
}
]
}
}
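
The recognized text is in the recognitionResult.lines array, with per-word detail nested under words. The following minimal sketch prints just the line text; like the OCR sketch earlier, it assumes the Newtonsoft.Json package rather than the IndexOf string check used in the sample:

using Newtonsoft.Json.Linq;

// Print each recognized line of handwriting from the final JSON response.
JObject result = JObject.Parse(contentString);
foreach (JToken line in result["recognitionResult"]["lines"])
    Console.WriteLine((string)line["text"]);

For the response above, this prints "when you write them down", "You remember things better", and "by hand".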
Computer Vision Java Quick Starts
6/8/2017 • 9 min to read • Edit Online

This article provides information and code samples to help you quickly get started using Java and the Computer
Vision API to accomplish the following tasks:
Analyze an image
Use a Domain-Specific Model
Intelligently generate a thumbnail
Detect and extract printed text from an image
Detect and extract handwritten text from an image

Prerequisites
Get the Microsoft Computer Vision Android SDK here.
To use the Computer Vision API, you need a subscription key. You can get free subscription keys here.

Analyze an Image With Computer Vision API Using Java


With the Analyze Image method, you can extract visual features based on image content. You can upload an image
or specify an image URL and choose which features to return, including:
The category defined in this taxonomy.
A detailed list of tags related to the image content.
A description of image content in a complete sentence.
The coordinates, gender, and age of any faces contained in the image.
The ImageType (clip art or a line drawing).
The dominant color, the accent color, or whether an image is black & white.
Whether the image contains adult or sexually suggestive content.
Analyze an Image Java Example Request
Change the REST URL to use the location where you obtained your subscription keys, and replace the "Ocp-Apim-
Subscription-Key" value with your valid subscription key.
// This sample uses the Apache HTTP client from HTTP Components (http://hc.apache.org/httpcomponents-client-ga/)
import java.net.URI;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.StringEntity;
import org.apache.http.client.utils.URIBuilder;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.util.EntityUtils;

public class Main
{
public static void main(String[] args)
{
HttpClient httpclient = new DefaultHttpClient();

try
{
// NOTE: You must use the same location in your REST call as you used to obtain your subscription keys.
// For example, if you obtained your subscription keys from westus, replace "westcentralus" in the
// URL below with "westus".
URIBuilder builder = new URIBuilder("https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/analyze");

builder.setParameter("visualFeatures", "Categories");
builder.setParameter("details", "Celebrities");
builder.setParameter("language", "en");

URI uri = builder.build();

HttpPost request = new HttpPost(uri);

// Request headers.
request.setHeader("Content-Type", "application/json");

// NOTE: Replace the example key with a valid subscription key.
request.setHeader("Ocp-Apim-Subscription-Key", "13hc77781f7e4b19b5fcdd72a8df7156");

// Request body. Replace the example URL with the URL for the JPEG image of a celebrity.
StringEntity reqEntity = new StringEntity("{\"url\":\"http://example.com/images/test.jpg\"}");
request.setEntity(reqEntity);

HttpResponse response = httpclient.execute(request);

HttpEntity entity = response.getEntity();

if (entity != null)
{
System.out.println(EntityUtils.toString(entity));
}
}
catch (Exception e)
{
System.out.println(e.getMessage());
}
}
}

Analyze an Image Response


A successful response is returned in JSON. The following example shows a successful response:

{
"categories": [
{
"name": "abstract_",
"score": 0.00390625
},
{
"name": "people_",
"score": 0.83984375,
"detail": {
"celebrities": [
{
"name": "Satya Nadella",
"faceRectangle": {
"left": 597,
"top": 162,
"width": 248,
"height": 248
},
"confidence": 0.999028444
}
]
}
}
],
"adult": {
"isAdultContent": false,
"isRacyContent": false,
"adultScore": 0.0934349000453949,
"racyScore": 0.068613491952419281
},
"tags": [
{
"name": "person",
"confidence": 0.98979085683822632
},
{
"name": "man",
"confidence": 0.94493889808654785
},
{
"name": "outdoor",
"confidence": 0.938492476940155
},
{
"name": "window",
"confidence": 0.89513939619064331
}
],
"description": {
"tags": [
"person",
"man",
"outdoor",
"window",
"glasses"
],
"captions": [
{
"text": "Satya Nadella sitting on a bench",
"confidence": 0.48293603002174407
}
] },
"requestId": "0dbec5ad-a3d3-4f7e-96b4-dfd57efe967d",
"metadata": {
"width": 1500,
"height": 1000,
"format": "Jpeg"
},
"faces": [
{
"age": 44,
"gender": "Male",
"faceRectangle": {
"left": 593,
"top": 160,
"top": 160,
"width": 250,
"height": 250
}
}
],
"color": {
"dominantColorForeground": "Brown",
"dominantColorBackground": "Brown",
"dominantColors": [
"Brown",
"Black"
],
"accentColor": "873B59",
"isBWImg": false
},
"imageType": {
"clipArtType": 0,
"lineDrawingType": 0
}
}

Use a Domain-Specific Model


The Domain-Specific Model is a model trained to identify a specific set of objects in an image. The two domain-
specific models that are currently available are celebrities and landmarks. The following example identifies a
landmark in an image.
Landmark Java Example Request
Change the REST URL to use the location where you obtained your subscription keys, and replace the "Ocp-Apim-
Subscription-Key" value with your valid subscription key.

// This sample uses the Apache HTTP client (org.apache.httpcomponents:httpclient:4.2.4)
// and org.json (org.json:20160810).

import java.net.URI;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.StringEntity;
import org.apache.http.client.utils.URIBuilder;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.util.EntityUtils;
import org.json.JSONObject;

public class Main
{
public static void main(String[] args)
{
HttpClient httpClient = new DefaultHttpClient();

try
{
// NOTE: You must use the same location in your REST call as you used to obtain your subscription keys.
// For example, if you obtained your subscription keys from westus, replace "westcentralus" in the
// URL below with "westus".
//
// Also, change "landmarks" to "celebrities" in the URL to use the Celebrities model.
URIBuilder uriBuilder = new URIBuilder("https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/models/landmarks/analyze");

// Change "landmarks" to "celebrities" to use the Celebrities model.


uriBuilder.setParameter("model", "landmarks");
uriBuilder.setParameter("model", "landmarks");

URI uri = uriBuilder.build();

HttpPost request = new HttpPost(uri);

// Request headers.
request.setHeader("Content-Type", "application/json");

// NOTE: Replace the "Ocp-Apim-Subscription-Key" value with a valid subscription key.
request.setHeader("Ocp-Apim-Subscription-Key", "13hc77781f7e4b19b5fcdd72a8df7156");

// Request body. Replace the example URL with the URL of a JPEG image of a landmark.
StringEntity requestEntity = new StringEntity("{\"url\":\"https://upload.wikimedia.org/wikipedia/commons/2/23/Space_Needle_2011-07-04.jpg\"}");
request.setEntity(requestEntity);

HttpResponse response = httpClient.execute(request);

HttpEntity entity = response.getEntity();

if (entity != null)
{
// Format and output the JSON response
String jsonString = EntityUtils.toString(entity);
JSONObject json = new JSONObject(jsonString);
System.out.println("REST Response:");
System.out.println(json.toString(2));
}
}
catch (Exception e)
{
// Display error message.
System.out.println(e.getMessage());
}
}
}

Landmark Example Response


A successful response is returned in JSON. Following is an example of a successful response:

REST Response:
{
"result": {"landmarks": [{
"confidence": 0.9998178,
"name": "Space Needle"
}]},
"metadata": {
"width": 2096,
"format": "Jpeg",
"height": 4132
},
"requestId": "7d0d00da-ac37-44ba-ad77-050e11a5ee05"
}

Get a Thumbnail with Computer Vision API Using Java


Use the Get Thumbnail method to crop an image based on its region of interest (ROI) to the height and width you
desire. The aspect ratio you set for the thumbnail can be different from the aspect ratio of the input image.
Get a Thumbnail Java Example Request
Change the REST URL to use the location where you obtained your subscription keys, and replace the "Ocp-Apim-
Subscription-Key" value with your valid subscription key.

// This sample uses the Apache HTTP client from HTTP Components (http://hc.apache.org/httpcomponents-client-ga/)
import java.awt.*;
import javax.swing.*;
import java.net.URI;
import java.io.InputStream;
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.StringEntity;
import org.apache.http.client.utils.URIBuilder;
import org.apache.http.impl.client.DefaultHttpClient;

public class Main
{
public static void main(String[] args)
{
HttpClient httpClient = new DefaultHttpClient();

try
{
// NOTE: You must use the same location in your REST call as you used to obtain your subscription keys.
// For example, if you obtained your subscription keys from westus, replace "westcentralus" in the
// URL below with "westus".
URIBuilder uriBuilder = new URIBuilder("https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/generateThumbnail");

uriBuilder.setParameter("width", "100");
uriBuilder.setParameter("height", "150");
uriBuilder.setParameter("smartCropping", "true");

URI uri = uriBuilder.build();

HttpPost request = new HttpPost(uri);

// Request headers.
request.setHeader("Content-Type", "application/json");

// NOTE: Replace the "Ocp-Apim-Subscription-Key" value with a valid subscription key.
request.setHeader("Ocp-Apim-Subscription-Key", "13hc77781f7e4b19b5fcdd72a8df7156");

// Request body. Replace the example URL with the URL for the JPEG image of a person.
StringEntity requestEntity = new StringEntity("{\"url\":\"http://example.com/images/test.jpg\"}");
request.setEntity(requestEntity);

HttpResponse response = httpClient.execute(request);

System.out.println(response);

// Display the thumbnail.
HttpEntity httpEntity = response.getEntity();
displayImage(httpEntity.getContent());
}
catch (Exception e)
{
System.out.println(e.getMessage());
}
}

private static void displayImage(InputStream inputStream)
{
try {
BufferedImage bufferedImage = ImageIO.read(inputStream);

ImageIcon imageIcon = new ImageIcon(bufferedImage);

JLabel jLabel = new JLabel();
jLabel.setIcon(imageIcon);

JFrame jFrame = new JFrame();
jFrame.setLayout(new FlowLayout());
jFrame.setSize(100, 150);

jFrame.add(jLabel);
jFrame.setVisible(true);
jFrame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
}
catch (Exception e) {
System.out.println(e.getMessage());
}
}
}

Get a Thumbnail Response


A successful response contains the thumbnail image binary. If the request fails, the response contains an error code
and a message to help determine what went wrong.

Optical Character Recognition (OCR) with Computer Vision API Using


Java
Use the Optical Character Recognition (OCR) method to detect printed text in an image and extract recognized
characters into a machine-usable character stream.
OCR Java Example Request
Change the REST URL to use the location where you obtained your subscription keys, and replace the "Ocp-Apim-
Subscription-Key" value with your valid subscription key.
// This sample uses the Apache HTTP client from HTTP Components (http://hc.apache.org/httpcomponents-client-ga/)
import java.net.URI;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.StringEntity;
import org.apache.http.client.utils.URIBuilder;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.util.EntityUtils;

public class Main
{
public static void main(String[] args)
{
HttpClient httpClient = new DefaultHttpClient();

try
{
// NOTE: You must use the same location in your REST call as you used to obtain your subscription keys.
// For example, if you obtained your subscription keys from westus, replace "westcentralus" in the
// URL below with "westus".
URIBuilder uriBuilder = new URIBuilder("https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/ocr");

uriBuilder.setParameter("language", "unk");
uriBuilder.setParameter("detectOrientation ", "true");

URI uri = uriBuilder.build();

HttpPost request = new HttpPost(uri);

// Request headers.
request.setHeader("Content-Type", "application/json");

// NOTE: Replace the "Ocp-Apim-Subscription-Key" value with a valid subscription key.
request.setHeader("Ocp-Apim-Subscription-Key", "13hc77781f7e4b19b5fcdd72a8df7156");

// Request body. Replace the example URL with the URL of a JPEG image containing text.
StringEntity requestEntity = new StringEntity("{\"url\":\"http://example.com/images/test.jpg\"}");
request.setEntity(requestEntity);

HttpResponse response = httpClient.execute(request);

HttpEntity entity = response.getEntity();

if (entity != null)
{
System.out.println(EntityUtils.toString(entity));
}
}
catch (Exception e)
{
System.out.println(e.getMessage());
}
}
}

OCR Example Response


Upon success, the OCR results returned include the detected text and bounding boxes for regions, lines, and words.
{
"language": "en",
"textAngle": -2.0000000000000338,
"orientation": "Up",
"regions": [
{
"boundingBox": "462,379,497,258",
"lines": [
{
"boundingBox": "462,379,497,74",
"words": [
{
"boundingBox": "462,379,41,73",
"text": "A"
},
{
"boundingBox": "523,379,153,73",
"text": "GOAL"
},
{
"boundingBox": "694,379,265,74",
"text": "WITHOUT"
}
]
},
{
"boundingBox": "565,471,289,74",
"words": [
{
"boundingBox": "565,471,41,73",
"text": "A"
},
{
"boundingBox": "626,471,150,73",
"text": "PLAN"
},
{
"boundingBox": "801,472,53,73",
"text": "IS"
}
]
},
{
"boundingBox": "519,563,375,74",
"words": [
{
"boundingBox": "519,563,149,74",
"text": "JUST"
},
{
"boundingBox": "683,564,41,72",
"text": "A"
},
{
"boundingBox": "741,564,153,73",
"text": "WISH"
}
]
}
]
}
]
}

Text Recognition with Computer Vision API Using Java


Use the RecognizeText method to detect handwritten or printed text in an image and extract recognized characters
into a machine-usable character stream.
Handwriting Recognition Java Example
Change the REST URL to use the location where you obtained your subscription keys, and replace the "Ocp-Apim-
Subscription-Key" value with your valid subscription key.

// This sample uses the Apache HTTP client from HTTP Components (http://hc.apache.org/httpcomponents-client-ga/)
import java.net.URI;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.util.EntityUtils;
import org.apache.http.Header;

public class Main
{
public static void main(String[] args)
{
HttpClient textClient = new DefaultHttpClient();
HttpClient resultClient = new DefaultHttpClient();

// NOTE: Replace this example key with your valid subscription key.
String subscriptionKey = "13hc77781f7e4b19b5fcdd72a8df7156";

try
{
// NOTE: You must use the same location in your REST call as you used to obtain your subscription keys.
// For example, if you obtained your subscription keys from westus, replace "westcentralus" in the
// URL below with "westus".
//
// Also, for printed text, set "handwriting" to false.
URI uri = new URI("https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/recognizeText?handwriting=true");
HttpPost textRequest = new HttpPost(uri);

// Request headers. Another valid content type is "application/octet-stream".
textRequest.setHeader("Content-Type", "application/json");
textRequest.setHeader("Ocp-Apim-Subscription-Key", subscriptionKey);

// Request body. Replace the example URL with the URL of a JPEG image containing handwriting.
StringEntity requestEntity = new StringEntity("{\"url\":\"http://example.com/images/test.jpg\"}");
textRequest.setEntity(requestEntity);

HttpResponse textResponse = textClient.execute(textRequest);

String operationLocation = null;

Header[] responseHeaders = textResponse.getAllHeaders();

for(Header header : responseHeaders) {
if(header.getName().equals("Operation-Location"))
{
// This string is the URI where you can get the text recognition operation result.
operationLocation = header.getValue();
break;
}
}

System.out.println(operationLocation);

HttpGet resultRequest = new HttpGet(operationLocation);
resultRequest.setHeader("Ocp-Apim-Subscription-Key", subscriptionKey);

// NOTE: The response may not be immediately available. Handwriting recognition is an
// async operation that can take a variable amount of time depending on the length
// of the text you want to recognize. You may need to wait or retry this operation.
HttpResponse resultResponse = resultClient.execute(resultRequest);
System.out.print("Text recognition result response: ");
System.out.println(resultResponse);
}
catch (Exception e)
{
System.out.println(e.getMessage());
}
}
}
Computer Vision JavaScript Quick Starts
5/24/2017 • 6 min to read • Edit Online

This article provides information and code samples to help you quickly get started using JavaScript and the
Computer Vision API to accomplish the following tasks:
Analyze an image
Use a Domain-Specific Model
Intelligently generate a thumbnail
Detect and extract text from an Image
Learn more about obtaining free Subscription Keys here

Analyze an Image With Computer Vision API Using JavaScript


With the Analyze Image method, you can extract visual features based on image content. You can upload an image
or specify an image URL and choose which features to return, including:
The category defined in this taxonomy.
A detailed list of tags related to the image content.
A description of image content in a complete sentence.
The coordinates, gender, and age of any faces contained in the image.
The ImageType (clip art or a line drawing).
The dominant color, the accent color, or whether an image is black & white.
Whether the image contains pornographic or sexually suggestive content.
Analyze an Image JavaScript Example Request
Copy the following and save it to a file such as analyze.html . Change the url to use the location where you obtained
your subscription keys, and replace the "Ocp-Apim-Subscription-Key" value with your valid subscription key. To run
the sample, drag-and-drop the file into your browser.
<!DOCTYPE html>
<html>
<head>
<title>Analyze an Image Sample</title>
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.9.0/jquery.min.js"></script>
</head>
<body>

<script type="text/javascript">
$(function() {
var params = {
// Request parameters
"visualFeatures": "Categories,Description,Color",
"details": "",
"language": "en",
};

$.ajax({
// NOTE: You must use the same location in your REST call as you used to obtain your subscription keys.
// For example, if you obtained your subscription keys from westus, replace "westcentralus" in the
// URL below with "westus".
url: "https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/analyze?" + $.param(params),

beforeSend: function(xhrObj){
// Request headers
xhrObj.setRequestHeader("Content-Type","application/json");

// NOTE: Replace the "Ocp-Apim-Subscription-Key" value with a valid subscription key.
xhrObj.setRequestHeader("Ocp-Apim-Subscription-Key", "13hc77781f7e4b19b5fcdd72a8df7156");
},

type: "POST",

// Request body
data: '{"url": "http://upload.wikimedia.org/wikipedia/commons/3/3c/Shaki_waterfall.jpg"}',
})

.done(function(data) {
// Show formatted JSON on webpage.
$("#responseTextArea").val(JSON.stringify(data, null, 2));
})

.fail(function(jqXHR, textStatus, errorThrown) {
// Display error message.
var errorString = (errorThrown === "") ? "Error. " : errorThrown + " (" + jqXHR.status + "): ";
errorString += (jqXHR.responseText === "") ? "" : jQuery.parseJSON(jqXHR.responseText).message;
alert(errorString);
});
});
</script>
REST response:
<br><br>
<textarea id="responseTextArea" class="UIInput" cols="120" rows="32"></textarea>
</body>
</html>

Analyze an Image Response


A successful response is returned in JSON. Following is an example of a successful response:
{
"categories": [
{
"name": "outdoor_water",
"score": 0.9921875
}
],
"description": {
"tags": [
"nature",
"water",
"waterfall",
"outdoor",
"rock",
"mountain",
"rocky",
"grass",
"hill",
"top",
"covered",
"hillside",
"standing",
"side",
"group",
"walking",
"white",
"man",
"large",
"snow",
"grazing",
"forest",
"slope",
"herd",
"river",
"giraffe",
"field"
],
"captions": [
{
"text": "a large waterfall over a rocky cliff",
"confidence": 0.9165147003194483
}
]
},
"requestId": "b372f8d6-4b56-43ed-9fbb-0ae2a888a1e9",
"metadata": {
"width": 1280,
"height": 959,
"format": "Jpeg"
},
"color": {
"dominantColorForeground": "Grey",
"dominantColorBackground": "Green",
"dominantColors": [
"Grey",
"Green"
],
"accentColor": "4D5E2F",
"isBWImg": false
}
}

Use a Domain-Specific Model


The Domain-Specific Model is a model trained to identify a specific set of objects in an image. The two domain-
specific models that are currently available are celebrities and landmarks. The following example identifies a
landmark in an image.
Landmark JavaScript Example Request
Copy the following and save it to a file such as landmark.html . Change the url to use the location where you obtained
your subscription keys, and replace the "Ocp-Apim-Subscription-Key" value with your valid subscription key. Then
drag-and-drop the file into your browser to run this sample.

<!DOCTYPE html>
<html>
<head>
<title>Landmark Sample</title>
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.9.0/jquery.min.js"></script>
</head>
<body>

<script type="text/javascript">
$(function() {
var params = {
// Request parameters.
"model": "landmarks", // Use "model": "celebrities" to use the Celebrities model.
};

$.ajax({
// NOTE: You must use the same location in your REST call as you used to obtain your subscription keys.
// For example, if you obtained your subscription keys from westus, replace "westcentralus" in the
// URL below with "westus".
//
// Also, change "landmarks" to "celebrities" in the url to use the Celebrities model.
url: "https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/models/landmarks/analyze?" + $.param(params),

beforeSend: function(xhrObj) {
// Request headers.
xhrObj.setRequestHeader("Content-Type", "application/json");

// NOTE: Replace the "Ocp-Apim-Subscription-Key" value with a valid subscription key.
xhrObj.setRequestHeader("Ocp-Apim-Subscription-Key", "13hc77781f7e4b19b5fcdd72a8df7156");
},

type: "POST",

// Request body
data: '{"url": "https://upload.wikimedia.org/wikipedia/commons/2/23/Space_Needle_2011-07-04.jpg"}',
})

.done(function(data) {
// Show formatted JSON on webpage.
$("#responseTextArea").val(JSON.stringify(data, null, 2));
})

.fail(function(jqXHR, textStatus, errorThrown) {
// Display error message.
var errorString = (errorThrown === "") ? "Error. " : errorThrown + " (" + jqXHR.status + "): ";
errorString += (jqXHR.responseText === "") ? "" : jQuery.parseJSON(jqXHR.responseText).message;
alert(errorString);
});
});
</script>
REST response:
<br><br>
<textarea id="responseTextArea" class="UIInput" cols="120" rows="32"></textarea>
</body>
</html>

Landmark Example Response


A successful response is returned in JSON. Following is an example of a successful response:

{
"requestId": "e0970003-1cb7-4ac6-b0d4-f36a1914bf4e",
"metadata": {
"width": 2096,
"height": 4132,
"format": "Jpeg"
},
"result": {
"landmarks": [
{
"name": "Space Needle",
"confidence": 0.9998178
}
]
}
}

Get a Thumbnail with Computer Vision API Using JavaScript


Use the Get Thumbnail method to crop an image based on its region of interest (ROI) to the height and width you
desire, even if the aspect ratio differs from the input image.
Get a Thumbnail JavaScript Example Request
Copy the following and save it to a file such as thumbnail.html . Change the url to use the location where you
obtained your subscription keys, replace the "Ocp-Apim-Subscription-Key" value with your valid subscription key,
and add the body. To run the sample, drag-and-drop the file into your browser.
<!DOCTYPE html>
<html>
<head>
<title>Thumbnail Sample</title>
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.9.0/jquery.min.js"></script>
</head>
<body>

<script type="text/javascript">
$(function() {
var params = {
// Request parameters
"width": "{number}", // Replace "{number}" with the desired width of your thumbnail.
"height": "{number}", // Replace "{number}" with the desired height of your thumbnail.
"smartCropping": "true",
};

$.ajax({
// NOTE: You must use the same location in your REST call as you used to obtain your subscription keys.
// For example, if you obtained your subscription keys from westus, replace "westcentralus" in the
// URL below with "westus".
url: "https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/generateThumbnail?" + $.param(params),
beforeSend: function(xhrObj){
// Request headers
xhrObj.setRequestHeader("Content-Type","application/json");

// Replace the "Ocp-Apim-Subscription-Key" value with a valid subscription key.
xhrObj.setRequestHeader("Ocp-Apim-Subscription-Key","13hc77781f7e4b19b5fcdd72a8df7156");
},
type: "POST",
// Request body
data: "{body}", // Replace "{body}" with the body. For example, '{"url": "http://www.example.com/images/image.jpg"}'
})
.done(function(data) {
alert("success");
})
.fail(function() {
alert("error");
});
});
</script>
</body>
</html>

Get a Thumbnail Response


A successful response contains the thumbnail image binary. If the request failed, the response contains an error
code and a message to help determine what went wrong.

Optical Character Recognition (OCR) with Computer Vision API Using JavaScript
Use the Optical Character Recognition (OCR) method to detect text in an image and extract recognized characters
into a machine-usable character stream.
OCR JavaScript Example Request
Copy the following and save it to a file such as ocr.html . Change the url to use the location where you
obtained your subscription keys, replace the "Ocp-Apim-Subscription-Key" value with your valid subscription key,
and add the body. Then drag-and-drop the file into your browser to run this sample.
<!DOCTYPE html>
<html>
<head>
<title>OCR Sample</title>
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.9.0/jquery.min.js"></script>
</head>
<body>

<script type="text/javascript">
$(function() {
var params = {
// Request parameters
"language": "unk",
"detectOrientation ": "true",
};

$.ajax({
// NOTE: You must use the same location in your REST call as you used to obtain your subscription keys.
// For example, if you obtained your subscription keys from westus, replace "westcentralus" in the
// URL below with "westus".
url: "https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/ocr?" + $.param(params),
beforeSend: function(xhrObj){
// Request headers
xhrObj.setRequestHeader("Content-Type","application/json");

// Replace the "Ocp-Apim-Subscription-Key" value with a valid subscription key.
xhrObj.setRequestHeader("Ocp-Apim-Subscription-Key","13hc77781f7e4b19b5fcdd72a8df7156");
},
type: "POST",
// Request body
data: "{body}", // Replace with the body, for example, "{"url": "http://www.example.com/images/image.jpg"}
})
.done(function(data) {
alert("success");
})
.fail(function() {
alert("error");
});
});
</script>
</body>
</html>

OCR Example Response


Upon success, the OCR results include the detected text and bounding boxes for regions, lines, and words.
{
"language": "en",
"textAngle": -2.0000000000000338,
"orientation": "Up",
"regions": [
{
"boundingBox": "462,379,497,258",
"lines": [
{
"boundingBox": "462,379,497,74",
"words": [
{
"boundingBox": "462,379,41,73",
"text": "A"
},
{
"boundingBox": "523,379,153,73",
"text": "GOAL"
},
{
"boundingBox": "694,379,265,74",
"text": "WITHOUT"
}
]
},
{
"boundingBox": "565,471,289,74",
"words": [
{
"boundingBox": "565,471,41,73",
"text": "A"
},
{
"boundingBox": "626,471,150,73",
"text": "PLAN"
},
{
"boundingBox": "801,472,53,73",
"text": "IS"
}
]
},
{
"boundingBox": "519,563,375,74",
"words": [
{
"boundingBox": "519,563,149,74",
"text": "JUST"
},
{
"boundingBox": "683,564,41,72",
"text": "A"
},
{
"boundingBox": "741,564,153,73",
"text": "WISH"
}
]
}
]
}
]
}
Computer Vision PHP Quick Starts
5/24/2017 • 6 min to read • Edit Online

This article provides information and code samples to help you quickly get started using the Computer Vision API
with PHP to accomplish the following tasks:
Analyze an image
Use a Domain-Specific Model
Intelligently generate a thumbnail
Detect and extract text from an Image
Learn more about obtaining free Subscription Keys here

Analyze an Image With Computer Vision API Using PHP


With the Analyze Image method, you can extract visual features based on image content. You can upload an image
or specify an image URL and choose which features to return, including:
The category defined in this taxonomy.
A detailed list of tags related to the image content.
A description of image content in a complete sentence.
The coordinates, gender, and age of any faces contained in the image.
The ImageType (clip art or a line drawing).
The dominant color, the accent color, or whether an image is black & white.
Whether the image contains pornographic or sexually suggestive content.
Analyze an Image PHP Example Request
Change the REST URL to use the location where you obtained your subscription keys, and replace the "Ocp-Apim-
Subscription-Key" value with your valid subscription key.
<?php
// This sample uses PEAR (https://pear.php.net/package/HTTP_Request2/download)
require_once 'HTTP/Request2.php';

// NOTE: You must use the same location in your REST call as you used to obtain your subscription keys.
// For example, if you obtained your subscription keys from westus, replace "westcentralus" in the
// URL below with "westus".
$request = new Http_Request2('https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/analyze');
$url = $request->getUrl();

$headers = array(
// Request headers
'Content-Type' => 'application/json',

// NOTE: Replace the "Ocp-Apim-Subscription-Key" value with a valid subscription key.
'Ocp-Apim-Subscription-Key' => '13hc77781f7e4b19b5fcdd72a8df7156',
);

$request->setHeader($headers);

$parameters = array(
// Request parameters
'visualFeatures' => 'Categories',
'details' => '{string}',
'language' => 'en',
);

$url->setQueryVariables($parameters);

$request->setMethod(HTTP_Request2::METHOD_POST);

// Request body
$request->setBody("{body}"); // Replace "{body}" with the body. For example, '{"url": "http://www.example.com/images/image.jpg"}'

try
{
$response = $request->send();
echo $response->getBody();
}
catch (HttpException $ex)
{
echo $ex;
}

?>

Analyze an Image Response


A successful response is returned in JSON. Following is an example of a successful response:

{
"categories": [
{
"name": "abstract_",
"score": 0.00390625
},
{
"name": "people_",
"score": 0.83984375,
"detail": {
"celebrities": [
{
"name": "Satya Nadella",
"faceRectangle": {
"left": 597,
"top": 162,
"width": 248,
"width": 248,
"height": 248
},
"confidence": 0.999028444
}
]
}
}
],
"adult": {
"isAdultContent": false,
"isRacyContent": false,
"adultScore": 0.0934349000453949,
"racyScore": 0.068613491952419281
},
"tags": [
{
"name": "person",
"confidence": 0.98979085683822632
},
{
"name": "man",
"confidence": 0.94493889808654785
},
{
"name": "outdoor",
"confidence": 0.938492476940155
},
{
"name": "window",
"confidence": 0.89513939619064331
}
],
"description": {
"tags": [
"person",
"man",
"outdoor",
"window",
"glasses"
],
"captions": [
{
"text": "Satya Nadella sitting on a bench",
"confidence": 0.48293603002174407
}
] },
"requestId": "0dbec5ad-a3d3-4f7e-96b4-dfd57efe967d",
"metadata": {
"width": 1500,
"height": 1000,
"format": "Jpeg"
},
"faces": [
{
"age": 44,
"gender": "Male",
"faceRectangle": {
"left": 593,
"top": 160,
"width": 250,
"height": 250
}
}
],
"color": {
"dominantColorForeground": "Brown",
"dominantColorBackground": "Brown",
"dominantColors": [
"Brown",
"Brown",
"Black"
],
"accentColor": "873B59",
"isBWImg": false
},
"imageType": {
"clipArtType": 0,
"lineDrawingType": 0
}
}

Use a Domain-Specific Model


The Domain-Specific Model is a model trained to identify a specific set of objects in an image. The two domain-
specific models that are currently available are celebrities and landmarks. The following example identifies a
landmark in an image.
Landmark PHP Example Request
Change the REST URL to use the location where you obtained your subscription keys, and replace the "Ocp-Apim-
Subscription-Key" value with your valid subscription key.
<html>
<head>
<title>PHP Sample</title>
</head>
<body>
<?php
// This sample uses PEAR (https://pear.php.net/package/HTTP_Request2/download)
require_once 'HTTP/Request2.php';

// NOTE: You must use the same location in your REST call as you used to obtain your subscription keys.
// For example, if you obtained your subscription keys from westus, replace "westcentralus" in the
// URL below with "westus".
//
// Also, change "landmarks" to "celebrities" in the url to use the Celebrities model.
$request = new Http_Request2('https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/models/landmarks/analyze');
$url = $request->getUrl();

$headers = array(
// Request headers
'Content-Type' => 'application/json',

// NOTE: Replace the "Ocp-Apim-Subscription-Key" value with a valid subscription key.
'Ocp-Apim-Subscription-Key' => '13hc77781f7e4b19b5fcdd72a8df7156',
);

$request->setHeader($headers);

$parameters = array(
// Request parameters
'model' => 'landmarks', // Use 'model' => 'celebrities' to use the Celebrities model.
);

$url->setQueryVariables($parameters);

$request->setMethod(HTTP_Request2::METHOD_POST);

// Request body
$body = json_encode(array(
// Request body parameters
'url' => 'https://upload.wikimedia.org/wikipedia/commons/2/23/Space_Needle_2011-07-04.jpg',
));
$request->setBody($body);

try
{
$response = $request->send();
echo "<pre>" . json_encode(json_decode($response->getBody()), JSON_PRETTY_PRINT) . "</pre>";
}
catch (HttpException $ex)
{
echo "<pre>" . $ex . "</pre>";
}
?>
</body>
</html>

Landmark Example Response


A successful response is returned in JSON. Following is an example of a successful response:
{
"requestId": "0663b074-8eb3-4fab-a72e-4c31a49bd22e",
"metadata": {
"width": 2096,
"height": 4132,
"format": "Jpeg"
},
"result": {
"landmarks": [
{
"name": "Space Needle",
"confidence": 0.9998178
}
]
}
}

Get a Thumbnail with Computer Vision API Using PHP


Use the Get Thumbnail method to crop an image based on its region of interest (ROI) to the height and width you
desire, even if the aspect ratio differs from the input image.
Get a Thumbnail PHP Example Request
Change the REST URL to use the location where you obtained your subscription keys, and replace the "Ocp-Apim-
Subscription-Key" value with your valid subscription key.
<?php
// This sample uses PEAR (https://pear.php.net/package/HTTP_Request2/download)
require_once 'HTTP/Request2.php';

// NOTE: You must use the same location in your REST call as you used to obtain your subscription keys.
// For example, if you obtained your subscription keys from westus, replace "westcentralus" in the
// URL below with "westus".
$request = new Http_Request2('https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/generateThumbnail');
$url = $request->getUrl();

$headers = array(
// Request headers
'Content-Type' => 'application/json',

// NOTE: Replace the "Ocp-Apim-Subscription-Key" value with a valid subscription key.
'Ocp-Apim-Subscription-Key' => '13hc77781f7e4b19b5fcdd72a8df7156',
);

$request->setHeader($headers);

$parameters = array(
// Request parameters
'width' => '{number}', // Replace "{number}" with the desired width of your thumbnail.
'height' => '{number}', // Replace "{number}" with the desired height of your thumbnail.
'smartCropping' => 'true',
);

$url->setQueryVariables($parameters);

$request->setMethod(HTTP_Request2::METHOD_POST);

// Request body
$request->setBody("{body}"); // Replace "{body}" with the body. For example, '{"url": "http://www.example.com/images/image.jpg"}'

try
{
$response = $request->send();
echo $response->getBody();
}
catch (HttpException $ex)
{
echo $ex;
}

?>

Get a Thumbnail Response


A successful response contains the thumbnail image binary. If the request failed, the response contains an error
code and a message to help determine what went wrong.

Optical Character Recognition (OCR) with Computer Vision API Using PHP
Use the Optical Character Recognition (OCR) method to detect text in an image and extract recognized characters
into a machine-usable character stream.
OCR PHP Example Request
Change the REST URL to use the location where you obtained your subscription keys, and replace the "Ocp-Apim-
Subscription-Key" value with your valid subscription key.
<?php
// This sample uses PEAR (https://pear.php.net/package/HTTP_Request2/download)
require_once 'HTTP/Request2.php';

// NOTE: You must use the same location in your REST call as you used to obtain your subscription keys.
// For example, if you obtained your subscription keys from westus, replace "westcentralus" in the
// URL below with "westus".
$request = new Http_Request2('https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/ocr');
$url = $request->getUrl();

$headers = array(
// Request headers
'Content-Type' => 'application/json',

// NOTE: Replace the "Ocp-Apim-Subscription-Key" value with a valid subscription key.
'Ocp-Apim-Subscription-Key' => '13hc77781f7e4b19b5fcdd72a8df7156',
);

$request->setHeader($headers);

$parameters = array(
// Request parameters
'language' => 'unk',
'detectOrientation' => 'true',
);

$url->setQueryVariables($parameters);

$request->setMethod(HTTP_Request2::METHOD_POST);

// Request body
$request->setBody("{body}"); // Replace "{body}" with the body. For example, '{"url": "http://www.example.com/images/image.jpg"}'

try
{
$response = $request->send();
echo $response->getBody();
}
catch (HttpException $ex)
{
echo $ex;
}

?>

OCR Example Response


Upon success, the OCR results include the detected text and bounding boxes for regions, lines, and words.
{
"language": "en",
"textAngle": -2.0000000000000338,
"orientation": "Up",
"regions": [
{
"boundingBox": "462,379,497,258",
"lines": [
{
"boundingBox": "462,379,497,74",
"words": [
{
"boundingBox": "462,379,41,73",
"text": "A"
},
{
"boundingBox": "523,379,153,73",
"text": "GOAL"
},
{
"boundingBox": "694,379,265,74",
"text": "WITHOUT"
}
]
},
{
"boundingBox": "565,471,289,74",
"words": [
{
"boundingBox": "565,471,41,73",
"text": "A"
},
{
"boundingBox": "626,471,150,73",
"text": "PLAN"
},
{
"boundingBox": "801,472,53,73",
"text": "IS"
}
]
},
{
"boundingBox": "519,563,375,74",
"words": [
{
"boundingBox": "519,563,149,74",
"text": "JUST"
},
{
"boundingBox": "683,564,41,72",
"text": "A"
},
{
"boundingBox": "741,564,153,73",
"text": "WISH"
}
]
}
]
}
]
}
Computer Vision Python Quick Starts
6/12/2017 • 19 min to read • Edit Online

This article provides information and code samples to help you quickly get started using the Computer Vision API
with Python to accomplish the following tasks:
Analyze an image
Use a Domain-Specific Model
Intelligently generate a thumbnail
Detect and extract printed text from an image
Detect and extract handwritten text from an image
To use the Computer Vision API, you need a subscription key. You can get free subscription keys here.

Analyze an Image With Computer Vision API Using Python


With the Analyze Image method, you can extract visual features based on image content. You can upload an image
or specify an image URL and choose which features to return, including:
A detailed list of tags related to the image content.
A description of image content in a complete sentence.
The coordinates, gender, and age of any faces contained in the image.
The ImageType (clip art or a line drawing).
The dominant color, the accent color, or whether an image is black & white.
The category defined in this taxonomy.
Whether the image contains adult or sexually suggestive content.
Analyze an Image Python Example Request
Copy the appropriate section for your version of Python and save it to a file such as analyze.py . Replace the
subscription_key value with your valid subscription key, and change the uri_base to use the location where you
obtained your subscription keys. Then execute the script.

########### Python 2.7 #############


import httplib, urllib, base64, json

###############################################
#### Update or verify the following values. ###
###############################################

# Replace the subscription_key string value with your valid subscription key.
subscription_key = '13hc77781f7e4b19b5fcdd72a8df7156'

# Replace or verify the region.


#
# You must use the same region in your REST API call as you used to obtain your subscription keys.
# For example, if you obtained your subscription keys from the westus region, replace
# "westcentralus" in the URI below with "westus".
#
# NOTE: Free trial subscription keys are generated in the westcentralus region, so if you are using
# a free trial subscription key, you should not need to change this region.
uri_base = 'westcentralus.api.cognitive.microsoft.com'

headers = {
# Request headers.
'Content-Type': 'application/json',
'Ocp-Apim-Subscription-Key': subscription_key,
}

params = urllib.urlencode({
# Request parameters. All of them are optional.
'visualFeatures': 'Categories,Description,Color',
'language': 'en',
})

# The URL of a JPEG image to analyze.


body = "{'url':'https://upload.wikimedia.org/wikipedia/commons/1/12/Broadway_and_Times_Square_by_night.jpg'}"

try:
# Execute the REST API call and get the response.
conn = httplib.HTTPSConnection(uri_base)
conn.request("POST", "/vision/v1.0/analyze?%s" % params, body, headers)
response = conn.getresponse()
data = response.read()

# 'data' contains the JSON data. The following formats the JSON data for display.
parsed = json.loads(data)
print ("Response:")
print (json.dumps(parsed, sort_keys=True, indent=2))
conn.close()

except Exception as e:
print('Error:')
print(e)

####################################

########### Python 3.6 #############


import http.client, urllib.request, urllib.parse, urllib.error, base64, json

###############################################
#### Update or verify the following values. ###
###############################################

# Replace the subscription_key string value with your valid subscription key.
subscription_key = '13hc77781f7e4b19b5fcdd72a8df7156'

# Replace or verify the region.


#
# You must use the same region in your REST API call as you used to obtain your subscription keys.
# For example, if you obtained your subscription keys from the westus region, replace
# "westcentralus" in the URI below with "westus".
#
# NOTE: Free trial subscription keys are generated in the westcentralus region, so if you are using
# a free trial subscription key, you should not need to change this region.
uri_base = 'westcentralus.api.cognitive.microsoft.com'

headers = {
# Request headers.
'Content-Type': 'application/json',
'Ocp-Apim-Subscription-Key': subscription_key,
}

params = urllib.parse.urlencode({
# Request parameters. All of them are optional.
'visualFeatures': 'Categories,Description,Color',
'language': 'en',
})

# The URL of a JPEG image to analyze.
body = "{'url':'https://upload.wikimedia.org/wikipedia/commons/1/12/Broadway_and_Times_Square_by_night.jpg'}"

try:
# Execute the REST API call and get the response.
conn = http.client.HTTPSConnection(uri_base)
conn.request("POST", "/vision/v1.0/analyze?%s" % params, body, headers)
response = conn.getresponse()
data = response.read()

# 'data' contains the JSON data. The following formats the JSON data for display.
parsed = json.loads(data)
print ("Response:")
print (json.dumps(parsed, sort_keys=True, indent=2))
conn.close()

except Exception as e:
print('Error:')
print(e)

####################################

Analyze an Image Response


A successful response is returned in JSON. Following is an example of a successful response:
Response:
{
"categories": [
{
"name": "outdoor_street",
"score": 0.625
}
],
"color": {
"accentColor": "B74314",
"dominantColorBackground": "Brown",
"dominantColorForeground": "Brown",
"dominantColors": [
"Brown"
],
"isBWImg": false
},
"description": {
"captions": [
{
"confidence": 0.8241403656347864,
"text": "a group of people on a city street filled with traffic at night"
}
],
"tags": [
"outdoor",
"building",
"street",
"city",
"busy",
"people",
"filled",
"traffic",
"many",
"table",
"car",
"group",
"walking",
"bunch",
"crowded",
"large",
"night",
"light",
"standing",
"man",
"tall",
"umbrella",
"riding",
"sign",
"crowd"
]
},
"metadata": {
"format": "Jpeg",
"height": 2436,
"width": 1826
},
"requestId": "92525ac4-fda0-47c3-876c-6457851fdb08"
}
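
Because the response is plain JSON, individual fields are easy to read programmatically. The following lines are a
minimal sketch (not part of the original sample) that assumes parsed holds the dictionary produced by
json.loads(data) in the scripts above:

# Minimal sketch: read a few fields out of the Analyze Image response.
# Assumes 'parsed' is the dictionary produced by json.loads(data) above.
caption = parsed['description']['captions'][0]
print('Caption: %s (confidence %.2f)' % (caption['text'], caption['confidence']))

for category in parsed.get('categories', []):
    print('Category: %s (score %.3f)' % (category['name'], category['score']))

print('Dominant colors: %s' % ', '.join(parsed['color']['dominantColors']))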

Use a Domain-Specific Model


The Domain-Specific Model is a model trained to identify a specific set of objects in an image. The two domain-
specific models that are currently available are celebrities and landmarks. The following example identifies a
landmark in an image.
Landmark Python Example Request
Copy the appropriate section for your version of Python and save it to a file such as landmark.py . Replace the
subscription_key value with your valid subscription key, and change the uri_base to use the location where you
obtained your subscription keys. Then execute the script.

########### Python 2.7 #############


import httplib, urllib, base64, json

###############################################
#### Update or verify the following values. ###
###############################################

# Replace the subscription_key string value with your valid subscription key.
subscription_key = '13hc77781f7e4b19b5fcdd72a8df7156'

# Replace or verify the region.


#
# You must use the same region in your REST API call as you used to obtain your subscription keys.
# For example, if you obtained your subscription keys from the westus region, replace
# "westcentralus" in the URI below with "westus".
#
# NOTE: Free trial subscription keys are generated in the westcentralus region, so if you are using
# a free trial subscription key, you should not need to change this region.
uri_base = 'westcentralus.api.cognitive.microsoft.com'

headers = {
# Request headers.
'Content-Type': 'application/json',
'Ocp-Apim-Subscription-Key': subscription_key,
}

params = urllib.urlencode({
# Request parameters. Use 'model': 'celebrities' to use the Celebrities model.
'model': 'landmarks',
})

# The URL of a JPEG image containing a landmark.


body = "{'url':'https://upload.wikimedia.org/wikipedia/commons/2/23/Space_Needle_2011-07-04.jpg'}"

try:
# Execute the REST API call and get the response.
conn = httplib.HTTPSConnection(uri_base)

# Change "landmarks" to "celebrities" in the url to use the Celebrities model.


conn.request("POST", "/vision/v1.0/models/landmarks/analyze?%s" % params, body, headers)
response = conn.getresponse()
data = response.read()

# 'data' contains the JSON data. The following formats the JSON data for display.
parsed = json.loads(data)
print ("Response:")
print (json.dumps(parsed, sort_keys=True, indent=2))
conn.close()

except Exception as e:
print('Error:')
print(e)

####################################

########### Python 3.6 #############


import http.client, urllib.request, urllib.parse, urllib.error, base64, json

###############################################
#### Update or verify the following values. ###
###############################################
# Replace the subscription_key string value with your valid subscription key.
subscription_key = '13hc77781f7e4b19b5fcdd72a8df7156'

# Replace or verify the region.


#
# You must use the same region in your REST API call as you used to obtain your subscription keys.
# For example, if you obtained your subscription keys from the westus region, replace
# "westcentralus" in the URI below with "westus".
#
# NOTE: Free trial subscription keys are generated in the westcentralus region, so if you are using
# a free trial subscription key, you should not need to change this region.
uri_base = 'westcentralus.api.cognitive.microsoft.com'

headers = {
# Request headers.
'Content-Type': 'application/json',
'Ocp-Apim-Subscription-Key': subscription_key,
}

params = urllib.parse.urlencode({
# Request parameters. Use "model": "celebrities" to use the Celebrity model.
'model': 'landmarks',
})

# The URL of a JPEG image containing a landmark.


body = "{'url':'https://upload.wikimedia.org/wikipedia/commons/2/23/Space_Needle_2011-07-04.jpg'}"

try:
# Execute the REST API call and get the response.
conn = http.client.HTTPSConnection(uri_base)
conn.request("POST", "/vision/v1.0/models/landmarks/analyze?%s" % params, body, headers)
response = conn.getresponse()
data = response.read()

# 'data' contains the JSON data. The following formats the JSON data for display.
parsed = json.loads(data)
print ("Response:")
print (json.dumps(parsed, sort_keys=True, indent=2))
conn.close()

except Exception as e:
print('Error:')
print(e)

####################################

Landmark Example Response


A successful response is returned in JSON. Following is an example of a successful response:
{
"metadata": {
"format": "Jpeg",
"height": 4132,
"width": 2096
},
"requestId": "d08a914a-0fbb-4695-9a2e-c93791865436",
"result": {
"landmarks": [
{
"confidence": 0.9998178,
"name": "Space Needle"
}
]
}
}
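
As with the Analyze Image response, the fields can be read directly. A minimal sketch, assuming parsed holds the
JSON shown above:

# Minimal sketch: list any landmarks found in the domain-specific model response.
# Assumes 'parsed' is the dictionary produced by json.loads(data) above.
for landmark in parsed['result'].get('landmarks', []):
    print('%s (confidence %.4f)' % (landmark['name'], landmark['confidence']))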

Get a Thumbnail with Computer Vision API Using Python


Use the Get Thumbnail method to crop an image based on its region of interest (ROI) to the height and width you
desire. The aspect ratio you set for the thumbnail can be different from the aspect ratio of the input image.
Get a Thumbnail Python Example Request
Copy the appropriate section for your version of Python and save it to a file such as thumbnail.py . Replace the
subscription_key value with your valid subscription key, and change the uri_base to use the location where you
obtained your subscription keys. Then execute the script.

########### Python 2.7 #############


import httplib, urllib, base64, json

###############################################
#### Update or verify the following values. ###
###############################################

# Replace the subscription_key string value with your valid subscription key.
subscription_key = '13hc77781f7e4b19b5fcdd72a8df7156'

# Replace or verify the region.


#
# You must use the same region in your REST API call as you used to obtain your subscription keys.
# For example, if you obtained your subscription keys from the westus region, replace
# "westcentralus" in the URI below with "westus".
#
# NOTE: Free trial subscription keys are generated in the westcentralus region, so if you are using
# a free trial subscription key, you should not need to change this region.
uri_base = 'westcentralus.api.cognitive.microsoft.com'

headers = {
# Request headers.
'Content-Type': 'application/json',
'Ocp-Apim-Subscription-Key': subscription_key,
}

params = urllib.urlencode({
# Request parameters. The smartCropping flag is optional.
'width': '150',
'height': '100',
'smartCropping': 'true',
})

# The URL of a JPEG image to use to create a thumbnail image.


body = "{'url':'https://upload.wikimedia.org/wikipedia/commons/9/94/Bloodhound_Puppy.jpg'}"

try:
# Execute the REST API call and get the response.
conn = httplib.HTTPSConnection(uri_base)
conn.request("POST", "/vision/v1.0/generateThumbnail?%s" % params, body, headers)
response = conn.getresponse()

# Check for success.


if response.status == 200:
# Success. Use 'response.read()' to return the image data.
# Display the response headers.
print ('Success.')
print ('Response headers:')
headers = response.getheaders()
for field, value in headers:
print (' ' + field + ': ' + value)
else:
# Error. 'data' contains the JSON error data. Display the error data.
data = response.read()
parsed = json.loads(data)
print ('Error:')
print (json.dumps(parsed, sort_keys=True, indent=2))

conn.close()

except Exception as e:
print('Error:')
print(e)

####################################

########### Python 3.6 #############


import http.client, urllib.request, urllib.parse, urllib.error, base64, json

###############################################
#### Update or verify the following values. ###
###############################################

# Replace the subscription_key string value with your valid subscription key.
subscription_key = '13hc77781f7e4b19b5fcdd72a8df7156'

# Replace or verify the region.


#
# You must use the same region in your REST API call as you used to obtain your subscription keys.
# For example, if you obtained your subscription keys from the westus region, replace
# "westcentralus" in the URI below with "westus".
#
# NOTE: Free trial subscription keys are generated in the westcentralus region, so if you are using
# a free trial subscription key, you should not need to change this region.
uri_base = 'westcentralus.api.cognitive.microsoft.com'

headers = {
# Request headers.
'Content-Type': 'application/json',
'Ocp-Apim-Subscription-Key': subscription_key,
}

params = urllib.parse.urlencode({
# Request parameters. The smartCropping flag is optional.
'width': '150',
'height': '100',
'smartCropping': 'true',
})

# The URL of a JPEG image to use to create a thumbnail image.
body = "{'url':'https://upload.wikimedia.org/wikipedia/commons/9/94/Bloodhound_Puppy.jpg'}"

try:
# Execute the REST API call and get the response.
conn = http.client.HTTPSConnection(uri_base)
conn.request("POST", "/vision/v1.0/generateThumbnail?%s" % params, body, headers)
response = conn.getresponse()

# Check for success.


if response.status == 200:
# Success. Use 'response.read()' to return the image data.
# Display the response headers.
print ('Success.')
print ('Response headers:')
headers = response.getheaders()
for field, value in headers:
print (' ' + field + ': ' + value)
else:
# Error. 'data' contains the JSON error data. Display the error data.
data = response.read()
parsed = json.loads(data)
print ('Error:')
print (json.dumps(parsed, sort_keys=True, indent=2))

conn.close()

except Exception as e:
print('Error:')
print(e)

####################################

Get a Thumbnail Response


A successful response contains the thumbnail image binary. If the request fails, the response contains an error code
and a message to help determine what went wrong. Following is an example of a successful response:

Success.
Response headers:
Cache-Control: no-cache
Pragma: no-cache
Content-Length: 4025
Content-Type: image/jpeg
Expires: -1
X-AspNet-Version: 4.0.30319
X-Powered-By: ASP.NET
apim-request-id: e2533391-44a9-4265-a681-91b07aaab6dd
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
x-content-type-options: nosniff
Date: Thu, 08 Jun 2017 22:51:16 GMT
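
Because the body of a successful response is the thumbnail itself, it can be written straight to disk. A minimal
sketch, assuming response is the status-200 response object from the scripts above; the output file name is an
arbitrary choice:

# Minimal sketch: save the thumbnail binary returned by generateThumbnail.
# Assumes 'response' is the status-200 response object from the scripts above.
with open('thumbnail.jpg', 'wb') as thumbnail_file:
    thumbnail_file.write(response.read())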

Optical Character Recognition (OCR) with Computer Vision API Using Python
Use the Optical Character Recognition (OCR) method to detect text in an image and extract recognized characters
into a machine-usable character stream.
OCR Python Example Request
Copy the appropriate section for your version of Python and save it to a file such as ocr.py . Replace the
subscription_key value with your valid subscription key, and change the uri_base to use the location where you
obtained your subscription keys. Then execute the script.

########### Python 2.7 #############


import httplib, urllib, base64, json

###############################################
#### Update or verify the following values. ###
###############################################

# Replace the subscription_key string value with your valid subscription key.
subscription_key = '13hc77781f7e4b19b5fcdd72a8df7156'

# Replace or verify the region.


#
# You must use the same region in your REST API call as you used to obtain your subscription keys.
# For example, if you obtained your subscription keys from the westus region, replace
# "westcentralus" in the URI below with "westus".
#
# NOTE: Free trial subscription keys are generated in the westcentralus region, so if you are using
# a free trial subscription key, you should not need to change this region.
uri_base = 'westcentralus.api.cognitive.microsoft.com'

headers = {
# Request headers.
'Content-Type': 'application/json',
'Ocp-Apim-Subscription-Key': subscription_key,
}

params = urllib.urlencode({
# Request parameters. The language setting "unk" means automatically detect the language.
'language': 'unk',
'detectOrientation': 'true',
})

# The URL of a JPEG image containing text.


body = "{'url':'https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/Atomist_quote_from_Democritus.png/338px-
Atomist_quote_from_Democritus.png'}"

try:
# Execute the REST API call and get the response.
conn = httplib.HTTPSConnection(uri_base)
conn.request("POST", "/vision/v1.0/ocr?%s" % params, body, headers)
response = conn.getresponse()
data = response.read()

# 'data' contains the JSON data. The following formats the JSON data for display.
parsed = json.loads(data)
print ("Response:")
print (json.dumps(parsed, sort_keys=True, indent=2))
conn.close()

except Exception as e:
print('Error:')
print(e)

####################################

########### Python 3.6 #############


import http.client, urllib.request, urllib.parse, urllib.error, base64, json

###############################################
#### Update or verify the following values. ###
###############################################

# Replace the subscription_key string value with your valid subscription key.
subscription_key = '13hc77781f7e4b19b5fcdd72a8df7156'

# Replace or verify the region.


#
# You must use the same region in your REST API call as you used to obtain your subscription keys.
# For example, if you obtained your subscription keys from the westus region, replace
# "westcentralus" in the URI below with "westus".
#
# NOTE: Free trial subscription keys are generated in the westcentralus region, so if you are using
# a free trial subscription key, you should not need to change this region.
uri_base = 'westcentralus.api.cognitive.microsoft.com'

headers = {
# Request headers.
'Content-Type': 'application/json',
'Ocp-Apim-Subscription-Key': subscription_key,
}

params = urllib.parse.urlencode({
# Request parameters. The language setting "unk" means automatically detect the language.
'language': 'unk',
'detectOrientation': 'true',
})

# The URL of a JPEG image containing text.


body = "{'url':'https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/Atomist_quote_from_Democritus.png/338px-
Atomist_quote_from_Democritus.png'}"

try:
# Execute the REST API call and get the response.
conn = http.client.HTTPSConnection(uri_base)
conn.request("POST", "/vision/v1.0/ocr?%s" % params, body, headers)
response = conn.getresponse()
data = response.read()

# 'data' contains the JSON data. The following formats the JSON data for display.
parsed = json.loads(data)
print ("Response:")
print (json.dumps(parsed, sort_keys=True, indent=2))
conn.close()

except Exception as e:
print('Error:')
print(e)

####################################

OCR Example Response


Upon success, the OCR results include the text from the image. They also include bounding boxes for regions, lines,
and words. Following is an example of a successful response:

Response:
{
"language": "en",
"orientation": "Up",
"regions": [
{
"boundingBox": "21,16,304,451",
"lines": [
{
"boundingBox": "28,16,288,41",
"words": [
{
"boundingBox": "28,16,288,41",
"text": "NOTHING"
}
]
},
{
"boundingBox": "27,66,283,52",
"words": [
{
"boundingBox": "27,66,283,52",
"text": "EXISTS"
}
]
},
{
"boundingBox": "27,128,292,49",
"words": [
{
"boundingBox": "27,128,292,49",
"text": "EXCEPT"
}
]
},
{
"boundingBox": "24,188,292,54",
"words": [
{
"boundingBox": "24,188,292,54",
"text": "ATOMS"
}
]
},
{
"boundingBox": "22,253,297,32",
"words": [
{
"boundingBox": "22,253,105,32",
"text": "AND"
},
{
"boundingBox": "144,253,175,32",
"text": "EMPTY"
}
]
},
{
"boundingBox": "21,298,304,60",
"words": [
{
"boundingBox": "21,298,304,60",
"text": "SPACE."
}
]
},
{
"boundingBox": "26,387,294,37",
"words": [
{
"boundingBox": "26,387,210,37",
"text": "Everything"
},
{
"boundingBox": "249,389,71,27",
"text": "else"
}
]
},
{
"boundingBox": "127,431,198,36",
"words": [
{
"boundingBox": "127,431,31,29",
"text": "is"
},
{
"boundingBox": "172,431,153,36",
"text": "opinion."
}
]
}
]
}
],
"textAngle": 0.0
}
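
To recover plain text from this structure, walk the regions, lines, and words hierarchy and join the word
fragments. A minimal sketch, assuming parsed holds the OCR JSON shown above:

# Minimal sketch: flatten the OCR response into one string per line of text.
# Assumes 'parsed' is the dictionary produced by json.loads(data) above.
for region in parsed['regions']:
    for line in region['lines']:
        print(' '.join(word['text'] for word in line['words']))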

Text recognition with Computer Vision API Using Python


Use the RecognizeText method to detect handwritten or printed text in an image and extract recognized characters
into a machine-usable character stream.
Handwriting Recognition Python Example
Copy the appropriate section for your version of Python and save it to a file such as handwriting.py . Replace the
subscription_key value with your valid subscription key, and change the uri_base to use the location where you
obtained your subscription keys. Then execute the script.

########### Python 2.7 #############


import httplib, urllib, base64, time, json

###############################################
#### Update or verify the following values. ###
###############################################

# Replace the subscription_key string value with your valid subscription key.
subscription_key = '13hc77781f7e4b19b5fcdd72a8df7156'

# Replace or verify the region.


#
# You must use the same region in your REST API call as you used to obtain your subscription keys.
# For example, if you obtained your subscription keys from the westus region, replace
# "westcentralus" in the URI below with "westus".
#
# NOTE: Free trial subscription keys are generated in the westcentralus region, so if you are using
# a free trial subscription key, you should not need to change this region.
uri_base = 'westcentralus.api.cognitive.microsoft.com'

headers = {
# Request headers.
# Another valid content type is "application/octet-stream".
'Content-Type': 'application/json',
'Ocp-Apim-Subscription-Key': subscription_key,
}

# The URL of a JPEG image containing handwritten text.


body = "{'url':'https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Cursive_Writing_on_Notebook_paper.jpg/800px-
Cursive_Writing_on_Notebook_paper.jpg'}"

# For printed text, set "handwriting" to false.


params = urllib.urlencode({'handwriting' : 'true'})

try:
# This operation requires two REST API calls: one to submit the image for processing,
# the other to retrieve the text found in the image.
#
# This executes the first REST API call and gets the response.
conn = httplib.HTTPSConnection(uri_base)
conn.request("POST", "/vision/v1.0/RecognizeText?%s" % params, body, headers)
response = conn.getresponse()

# Success is indicated by a status of 202.


if response.status != 202:
# Display JSON data and exit if the first REST API call was not successful.
parsed = json.loads(response.read())
print ("Error:")
print (json.dumps(parsed, sort_keys=True, indent=2))
conn.close()
exit()
# The 'Operation-Location' in the response contains the URI to retrieve the recognized text.
operationLocation = response.getheader('Operation-Location')
parsedLocation = operationLocation.split(uri_base)
answerURL = parsedLocation[1]

# NOTE: The response may not be immediately available. Handwriting recognition is an


# async operation that can take a variable amount of time depending on the length
# of the text you want to recognize. You may need to wait or retry this GET operation.

print('\nHandwritten text submitted. Waiting 10 seconds to retrieve the recognized text.\n')


time.sleep(10)

# Execute the second REST API call and get the response.
conn = httplib.HTTPSConnection(uri_base)
conn.request("GET", answerURL, '', headers)
response = conn.getresponse()
data = response.read()

# 'data' contains the JSON data. The following formats the JSON data for display.
parsed = json.loads(data)
print ("Response:")
print (json.dumps(parsed, sort_keys=True, indent=2))
conn.close()

except Exception as e:
print('Error:')
print(e)

####################################

########### Python 3.6 #############


import http.client, urllib.request, urllib.parse, urllib.error, base64, requests, time, json

###############################################
#### Update or verify the following values. ###
###############################################

# Replace the subscription_key string value with your valid subscription key.
subscription_key = '13hc77781f7e4b19b5fcdd72a8df7156'

# Replace or verify the region.


#
# You must use the same region in your REST API call as you used to obtain your subscription keys.
# For example, if you obtained your subscription keys from the westus region, replace
# "westcentralus" in the URI below with "westus".
#
# NOTE: Free trial subscription keys are generated in the westcentralus region, so if you are using
# a free trial subscription key, you should not need to change this region.
uri_base = 'https://westcentralus.api.cognitive.microsoft.com'

requestHeaders = {
# Request headers.
# Another valid content type is "application/octet-stream".
'Content-Type': 'application/json',
'Ocp-Apim-Subscription-Key': subscription_key,
}

# The URL of a JPEG image containing handwritten text.


body = {'url' : 'https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Cursive_Writing_on_Notebook_paper.jpg/800px-Cursive_Writing_on_Notebook_paper.jpg'}

# For printed text, set "handwriting" to false.


params = {'handwriting' : 'true'}

try:
# This operation requires two REST API calls: one to submit the image for processing,
# the other to retrieve the text found in the image.
#
# This executes the first REST API call and gets the response.
response = requests.request('POST', uri_base + '/vision/v1.0/RecognizeText', json=body, data=None, headers=requestHeaders, params=params)

# Success is indicated by a status of 202.


if response.status_code != 202:
# if the first REST API call was not successful, display JSON data and exit.
parsed = json.loads(response.text)
print ("Error:")
print (json.dumps(parsed, sort_keys=True, indent=2))
exit()

# The 'Operation-Location' in the response contains the URI to retrieve the recognized text.
operationLocation = response.headers['Operation-Location']

# Note: The response may not be immediately available. Handwriting recognition is an


# async operation that can take a variable amount of time depending on the length
# of the text you want to recognize. You may need to wait or retry this GET operation.

print('\nHandwritten text submitted. Waiting 10 seconds to retrieve the recognized text.\n')


time.sleep(10)

# Execute the second REST API call and get the response.
response = requests.request('GET', operationLocation, json=None, data=None, headers=requestHeaders, params=None)

# 'data' contains the JSON data. The following formats the JSON data for display.
parsed = json.loads(response.text)
print ("Response:")
print (json.dumps(parsed, sort_keys=True, indent=2))

except Exception as e:
print('Error:')
print(e)

####################################
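
The fixed ten-second wait in the scripts above is only a convenience. Because the operation reports its own status,
a more robust approach is to poll the Operation-Location URI until the operation finishes. The following is a sketch
of that idea for the Python 3.6 version; it assumes the operationLocation and requestHeaders values from the script
above, and the retry count, delay, and 'Failed' status string are assumptions rather than documented requirements:

import json, time
import requests

# Sketch: poll the Operation-Location URI instead of sleeping a fixed time.
# Assumes 'operationLocation' and 'requestHeaders' are set as in the script above.
result = None
for _ in range(10):                  # arbitrary retry limit
    response = requests.get(operationLocation, headers=requestHeaders)
    result = response.json()
    # 'Succeeded' matches the example response below; 'Failed' is assumed
    # to be the terminal error status.
    if result.get('status') in ('Succeeded', 'Failed'):
        break
    time.sleep(1)                    # arbitrary delay between polls

print(json.dumps(result, sort_keys=True, indent=2))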

A successful response is returned in JSON. Following is an example of a successful response:

Response:
{
"recognitionResult": {
"lines": [
{
"boundingBox": [
2,
84,
783,
96,
782,
154,
1,
148
],
"text": "Pack my box with five dozen liquor jugs",
"words": [
{
"boundingBox": [
6,
86,
92,
87,
71,
151,
0,
150
],
"text": "Pack"
},
{
"boundingBox": [
86,
87,
172,
88,
150,
152,
64,
151
],
"text": "my"
},
{
"boundingBox": [
165,
88,
241,
89,
219,
152,
144,
152
],
"text": "box"
},
{
"boundingBox": [
234,
89,
343,
90,
322,
154,
213,
152
],
"text": "with"
},
{
"boundingBox": [
347,
90,
432,
91,
411,
154,
325,
154
],
"text": "five"
},
{
"boundingBox": [
432,
91,
538,
92,
516,
154,
411,
154
],
"text": "dozen"
},
{
"boundingBox": [
554,
92,
696,
94,
675,
154,
533,
154
],
"text": "liquor"
},
{
"boundingBox": [
710,
94,
800,
96,
800,
154,
688,
154
],
"text": "jugs"
}
]
},
{
"boundingBox": [
2,
52,
65,
46,
69,
89,
7,
95
],
"text": "dog",
"words": [
{
"boundingBox": [
0,
62,
79,
39,
94,
82,
0,
105
],
"text": "dog"
}
]
},
{
"boundingBox": [
6,
2,
771,
13,
770,
75,
5,
64
],
"text": "The quick brown fox jumps over the lazy",
"words": [
{
"boundingBox": [
8,
4,
92,
5,
77,
71,
0,
71
],
"text": "The"
},
{
"boundingBox": [
89,
5,
188,
5,
173,
72,
74,
71
],
"text": "quick"
},
{
"boundingBox": [
188,
5,
323,
6,
308,
73,
173,
72
],
"text": "brown"
},
{
"boundingBox": [
316,
6,
386,
6,
371,
73,
302,
73
],
"text": "fox"
},
{
"boundingBox": [
396,
7,
508,
7,
493,
74,
381,
73
],
"text": "jumps"
},
{
"boundingBox": [
501,
7,
604,
8,
589,
75,
487,
74
],
"text": "over"
},
{
"boundingBox": [
600,
8,
673,
8,
658,
75,
586,
75
],
"text": "the"
},
{
"boundingBox": [
670,
8,
800,
9,
787,
76,
655,
75
],
"text": "lazy"
}
]
}
]
},
"status": "Succeeded"
}
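
The recognized text itself is carried in the text field of each line. A minimal sketch, assuming parsed holds the
JSON shown above:

# Minimal sketch: print each recognized line of handwritten text.
# Assumes 'parsed' is the dictionary produced from the second REST call above.
for line in parsed['recognitionResult']['lines']:
    print(line['text'])
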
Computer Vision Ruby Quick Starts
5/24/2017 • 4 min to read • Edit Online

This article provides information and code samples to help you quickly get started using the Computer Vision API
with Ruby to accomplish the following tasks:
Analyze an image
Intelligently generate a thumbnail
Detect and extract text from an Image
Learn more about obtaining free Subscription Keys here

Analyze an Image With Computer Vision API Using Ruby


With the Analyze Image method, you can extract visual features based on image content. You can upload an image
or specify an image URL and choose which features to return, including:
The category defined in this taxonomy.
A detailed list of tags related to the image content.
A description of image content in a complete sentence.
The coordinates, gender, and age of any faces contained in the image.
The ImageType (clip art or a line drawing).
The dominant color, the accent color, or whether an image is black & white.
Whether the image contains pornographic or sexually suggestive content.
Analyze an Image Ruby Example Request
Change the REST URL to use the location where you obtained your subscription keys, replace the "Ocp-Apim-
Subscription-Key" value with your valid subscription key, and add a URL to a photograph of a celebrity to the body
variable.
require 'net/http'

# NOTE: You must use the same location in your REST call as you used to obtain your subscription keys.
# For example, if you obtained your subscription keys from westus, replace "westcentralus" in the
# URL below with "westus".
uri = URI('https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/analyze')
uri.query = URI.encode_www_form({
# Request parameters
'visualFeatures' => 'Categories',
'details' => '{string}',
'language' => 'en'
})

request = Net::HTTP::Post.new(uri.request_uri)
# Request headers
request['Content-Type'] = 'application/json'
# NOTE: Replace the "Ocp-Apim-Subscription-Key" value with a valid subscription key.
request['Ocp-Apim-Subscription-Key'] = '{subscription key}'
# Replace with the body, for example, "{\"url\": \"http://www.example.com/images/image.jpg\"}"
request.body = "{body}"

response = Net::HTTP.start(uri.host, uri.port, :use_ssl => uri.scheme == 'https') do |http|


http.request(request)
end

puts response.body

Analyze an Image Response


A successful response is returned in JSON. Following is an example of a successful response:

{
"categories": [
{
"name": "abstract_",
"score": 0.00390625
},
{
"name": "people_",
"score": 0.83984375,
"detail": {
"celebrities": [
{
"name": "Satya Nadella",
"faceRectangle": {
"left": 597,
"top": 162,
"width": 248,
"height": 248
},
"confidence": 0.999028444
}
]
}
}
],
"adult": {
"isAdultContent": false,
"isRacyContent": false,
"adultScore": 0.0934349000453949,
"racyScore": 0.068613491952419281
},
"tags": [
{
"name": "person",
"confidence": 0.98979085683822632
},
{
"name": "man",
"confidence": 0.94493889808654785
},
{
"name": "outdoor",
"confidence": 0.938492476940155
},
{
"name": "window",
"confidence": 0.89513939619064331
}
],
"description": {
"tags": [
"person",
"man",
"outdoor",
"window",
"glasses"
],
"captions": [
{
"text": "Satya Nadella sitting on a bench",
"confidence": 0.48293603002174407
}
]
},
"requestId": "0dbec5ad-a3d3-4f7e-96b4-dfd57efe967d",
"metadata": {
"width": 1500,
"height": 1000,
"format": "Jpeg"
},
"faces": [
{
"age": 44,
"gender": "Male",
"faceRectangle": {
"left": 593,
"top": 160,
"width": 250,
"height": 250
}
}
],
"color": {
"dominantColorForeground": "Brown",
"dominantColorBackground": "Brown",
"dominantColors": [
"Brown",
"Black"
],
"accentColor": "873B59",
"isBWImg": false
},
"imageType": {
"clipArtType": 0,
"lineDrawingType": 0
}
}

Get a Thumbnail with Computer Vision API Using Ruby


Use the Get Thumbnail method to crop an image based on its region of interest (ROI) to the height and width you
desire, even if the aspect ratio differs from the input image.
Get a Thumbnail Ruby Example Request
Change the REST URL to use the location where you obtained your subscription keys, replace the "Ocp-Apim-
Subscription-Key" value with your valid subscription key, and add the URL of an image to the body variable.

require 'net/http'

# NOTE: You must use the same location in your REST call as you used to obtain your subscription keys.
# For example, if you obtained your subscription keys from westus, replace "westcentralus" in the
# URL below with "westus".
uri = URI('https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/generateThumbnail')
uri.query = URI.encode_www_form({
# Request parameters
'width' => '{number}',
'height' => '{number}',
'smartCropping' => 'true'
})

request = Net::HTTP::Post.new(uri.request_uri)
# Request headers
request['Content-Type'] = 'application/json'
# NOTE: Replace the "Ocp-Apim-Subscription-Key" value with a valid subscription key.
request['Ocp-Apim-Subscription-Key'] = '{subscription key}'
# Replace with the body, for example, "{\"url\": \"http://www.example.com/images/image.jpg\"}"
request.body = "{body}"

response = Net::HTTP.start(uri.host, uri.port, :use_ssl => uri.scheme == 'https') do |http|


http.request(request)
end

puts response.body

Get a Thumbnail Response


A successful response contains the thumbnail image binary. If the request fails, the response contains an error
code and a message to help determine what went wrong.

Optical Character Recognition (OCR) with Computer Vision API Using Ruby
Use the Optical Character Recognition (OCR) method to detect text in an image and extract recognized characters
into a machine-usable character stream.
OCR Ruby Example Request
Change the REST URL to use the location where you obtained your subscription keys, replace the "Ocp-Apim-
Subscription-Key" value with your valid subscription key, and add the URL of an image containing text to the body
variable.
require 'net/http'

# NOTE: You must use the same location in your REST call as you used to obtain your subscription keys.
# For example, if you obtained your subscription keys from westus, replace "westcentralus" in the
# URL below with "westus".
uri = URI('https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/ocr')
uri.query = URI.encode_www_form({
# Request parameters
'language' => 'unk',
'detectOrientation' => 'true'
})

request = Net::HTTP::Post.new(uri.request_uri)
# Request headers
request['Content-Type'] = 'application/json'
# NOTE: Replace the "Ocp-Apim-Subscription-Key" value with a valid subscription key.
request['Ocp-Apim-Subscription-Key'] = '{subscription key}'
# Replace with the body, for example, "{\"url\": \"http://www.example.com/images/image.jpg\"}"
request.body = "{body}"

response = Net::HTTP.start(uri.host, uri.port, :use_ssl => uri.scheme == 'https') do |http|


http.request(request)
end

puts response.body

OCR Example Response


Upon success, the OCR results returned include text, bounding box for regions, lines, and words.
{
"language": "en",
"textAngle": -2.0000000000000338,
"orientation": "Up",
"regions": [
{
"boundingBox": "462,379,497,258",
"lines": [
{
"boundingBox": "462,379,497,74",
"words": [
{
"boundingBox": "462,379,41,73",
"text": "A"
},
{
"boundingBox": "523,379,153,73",
"text": "GOAL"
},
{
"boundingBox": "694,379,265,74",
"text": "WITHOUT"
}
]
},
{
"boundingBox": "565,471,289,74",
"words": [
{
"boundingBox": "565,471,41,73",
"text": "A"
},
{
"boundingBox": "626,471,150,73",
"text": "PLAN"
},
{
"boundingBox": "801,472,53,73",
"text": "IS"
}
]
},
{
"boundingBox": "519,563,375,74",
"words": [
{
"boundingBox": "519,563,149,74",
"text": "JUST"
},
{
"boundingBox": "683,564,41,72",
"text": "A"
},
{
"boundingBox": "741,564,153,73",
"text": "WISH"
}
]
}
]
}
]
}
Computer Vision API C# Tutorial
5/25/2017 • 4 min to read • Edit Online

Explore a basic Windows application that uses Computer Vision API to perform optical character recognition (OCR),
create smart-cropped thumbnails, plus detect, categorize, tag and describe visual features, including faces, in an
image. The below example lets you submit an image URL or a locally stored file. You can use this open source
example as a template for building your own app for Windows using the Vision API and WPF (Windows
Presentation Foundation), a part of .NET Framework.
Prerequisites

Platform requirements
The below example has been developed for the .NET Framework using Visual Studio 2015, Community Edition.
Subscribe to Computer Vision API and get a subscription key
Before creating the example, you must subscribe to Computer Vision API which is part of the Microsoft Cognitive
Services (formerly Project Oxford). For subscription and key management details, see Subscriptions. Both the
primary and secondary key can be used in this tutorial.

NOTE
The tutorial is designed to use subscription keys in the westcentralus region. The subscription keys generated in the
Computer Vision free trial use the westcentralus region, so they work correctly. If you generated your subscription keys
using your Azure account through https://azure.microsoft.com/, you must specify the westcentralus region. Keys generated
outside the westcentralus region will not work.

Get the client library and example


You may clone the Computer Vision API client library and example application to your computer via SDK. Don't
download it as a ZIP.

Step 1: Install the example
In your GitHub Desktop, open Sample-WPF\VisionAPI-WPF-Samples.sln.

Step 2: Build the example
Press Ctrl+Shift+B, or click Build on the ribbon menu, then select Build Solution.

Step 3: Run the example
1. After the build is complete, press F5 or click Start on the ribbon menu to run the example.

2. Locate the Computer Vision API user interface window with the text edit box reading "Paste your
subscription key here to start". You can choose to persist your subscription key on your PC or laptop by
clicking the "Save Key" button. When you want to delete the subscription key from the system, click "Delete
Key" to remove it from your PC or laptop.

3. Under "Select Scenario" click to use one of the six scenarios, then follow the instructions on the screen.
Microsoft receives the images you upload and may use them to improve Computer Vision API and related
services. By submitting an image, you confirm that you have followed our Developer Code of Conduct.
4. There are example images to be used with this example application. You can find these images on the Face
API Windows Github repo, in the Data folder. Please note the use of these images is licensed under
agreement LICENSE-IMAGE.
Review and Learn

Now that you have a running application, let us review how this example app integrates with Cognitive Services
technology. This will make it easier to either continue building onto this app or develop your own app using
Microsoft Computer Vision API.
This example app makes use of the Computer Vision API Client Library, a thin C# client wrapper for the Microsoft
Computer Vision API. When you built the example app as described above, you got the Client Library from a NuGet
package. You can review the Client Library source code in the folder titled “Client Library” under Vision,
Windows, Client Library, which is part of the downloaded file repository mentioned above in Prerequisites.
You can also find out how to use the Client Library code in Solution Explorer: Under VisionAPI-WPF_Samples,
expand AnalyzePage.xaml to locate AnalyzePage.xaml.cs, which is used for submitting an image to the image
analysis endpoint. Double-click the .xaml.cs files to have them open in new windows in Visual Studio.
Reviewing how the Vision Client Library gets used in our example app, let's look at two code snippets from
AnalyzePage.xaml.cs. The file contains code comments indicating “KEY SAMPLE CODE STARTS HERE” and “KEY
SAMPLE CODE ENDS HERE” to help you locate the code snippets reproduced below.
The analyze endpoint is able to work with either an image URL or binary image data (in form of an octet stream) as
input. First, you find a using directive, which lets you use the Vision Client Library.
// ----------------------------------------------------------------------
// KEY SAMPLE CODE STARTS HERE
// Use the following namespace for VisionServiceClient
// ----------------------------------------------------------------------
using Microsoft.ProjectOxford.Vision;
using Microsoft.ProjectOxford.Vision.Contract;
// ----------------------------------------------------------------------
// KEY SAMPLE CODE ENDS HERE
// ----------------------------------------------------------------------

UploadAndAnalyzeImage(…) This code snippet shows how to use the Client Library to submit your subscription
key and a locally stored image to the analyze endpoint of the Computer Vision API service.

private async Task<AnalysisResult> UploadAndAnalyzeImage(string imageFilePath)


{
// -----------------------------------------------------------------------
// KEY SAMPLE CODE STARTS HERE
// -----------------------------------------------------------------------
//
// Create Project Oxford Computer Vision API Service client
//
VisionServiceClient VisionServiceClient = new VisionServiceClient(SubscriptionKey);
Log("VisionServiceClient is created");

using (Stream imageFileStream = File.OpenRead(imageFilePath))


{
//
// Analyze the image for all visual features
//
Log("Calling VisionServiceClient.AnalyzeImageAsync()...");
VisualFeature[] visualFeatures = new VisualFeature[] { VisualFeature.Adult, VisualFeature.Categories, VisualFeature.Color,
VisualFeature.Description, VisualFeature.Faces, VisualFeature.ImageType, VisualFeature.Tags };
AnalysisResult analysisResult = await VisionServiceClient.AnalyzeImageAsync(imageFileStream, visualFeatures);
return analysisResult;
}

// -----------------------------------------------------------------------
// KEY SAMPLE CODE ENDS HERE
// -----------------------------------------------------------------------
}

AnalyzeUrl(…) This code snippet shows how to use the Client Library to submit your subscription key and a photo
URL to the analyze endpoint of the Computer Vision API service.
private async Task<AnalysisResult> AnalyzeUrl(string imageUrl)
{
// -----------------------------------------------------------------------
// KEY SAMPLE CODE STARTS HERE
// -----------------------------------------------------------------------

//
// Create Project Oxford Computer Vision API Service client
//
VisionServiceClient VisionServiceClient = new VisionServiceClient(SubscriptionKey);
Log("VisionServiceClient is created");

//
// Analyze the url for all visual features
//
Log("Calling VisionServiceClient.AnalyzeImageAsync()...");
VisualFeature[] visualFeatures = new VisualFeature[] { VisualFeature.Adult, VisualFeature.Categories, VisualFeature.Color,
VisualFeature.Description, VisualFeature.Faces, VisualFeature.ImageType, VisualFeature.Tags };
AnalysisResult analysisResult = await VisionServiceClient.AnalyzeImageAsync(imageUrl, visualFeatures);
return analysisResult;
}
// -----------------------------------------------------------------------
// KEY SAMPLE CODE ENDS HERE
// -----------------------------------------------------------------------

Other pages and endpoints How to interact with the other endpoints exposed by the Computer Vision API service
can be seen by looking at the other pages in the sample; for instance, the OCR endpoint is shown as part of the
code contained in OCRPage.xaml.cs
Related Topics

Get started with Face API


Computer Vision API Python Tutorial
5/25/2017 • 1 min to read • Edit Online

This tutorial shows you how to use the Computer Vision API in Python and how to visualize your results using some
popular libraries. Use Jupyter to run the tutorial. To learn how to get started with interactive Jupyter notebooks,
refer to: Jupyter Documentation.
Opening the Tutorial Notebook in Jupyter
1. Navigate to the tutorial notebook in GitHub.
2. Click on the green button to clone or download the tutorial.
3. Open a command prompt and go to the folder Cognitive-Vision-Python-master\Jupyter Notebook.
4. Run the command jupyter notebook from the command prompt. This will start Jupyter.
5. In the Jupyter window, click on Computer Vision API Example.ipynb to open the tutorial notebook.
Running the Tutorial
To use this notebook, you will need a subscription key for the Computer Vision API. Visit the Subscription page to
sign up. On the “Sign in” page, use your Microsoft account to sign in and you will be able to subscribe and get free
keys. After completing the sign-up process, paste your key into the variables section of the notebook (reproduced
below). Either the primary or the secondary key works. Make sure to enclose the key in quotes to make it a string.

# Variables

_url = 'https://westcentralus.api.cognitive.microsoft.com/vision/v1/analyses'
_key = None #Here you have to paste your primary key
_maxNumRetries = 10
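
As an illustration of how these variables are used, a single Analyze Image call with the requests package might look
like the following sketch. This is not the notebook's exact code, and the image URL is a placeholder:

import requests

# Sketch only: one call using the '_url' and '_key' variables above.
# The image URL is a placeholder, not part of the tutorial notebook.
headers = {
    'Content-Type': 'application/json',
    'Ocp-Apim-Subscription-Key': _key,
}
params = {'visualFeatures': 'Categories,Description,Color'}
body = {'url': 'http://www.example.com/images/image.jpg'}

response = requests.post(_url, headers=headers, params=params, json=body)
print(response.json())
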
86-Categories Taxonomy
4/12/2017 • 1 min to read • Edit Online

abstract_
abstract_net
abstract_nonphoto
abstract_rect
abstract_shape
abstract_texture
animal_
animal_bird
animal_cat
animal_dog
animal_horse
animal_panda
building_
building_arch
building_brickwall
building_church
building_corner
building_doorwindows
building_pillar
building_stair
building_street
dark_
dark_fire
dark_fireworks
dark_light
drink_
drink_can
food_
food_bread
food_fastfood
food_grilled
food_pizza
indoor_
indoor_churchwindow
indoor_court
indoor_doorwindows
indoor_marketstore
indoor_room
indoor_venue
object_screen
object_sculpture
others_
outdoor_
outdoor_city
outdoor_field
outdoor_grass
outdoor_house
outdoor_mountain
outdoor_oceanbeach
outdoor_playground
outdoor_pool
outdoor_railway
outdoor_road
outdoor_sportsfield
outdoor_stonerock
outdoor_street
outdoor_water
outdoor_waterside
people_
people_baby
people_crowd
people_group
people_hand
people_many
people_portrait
people_show
people_swimming
people_tattoo
people_young
plant_
plant_branch
plant_flower
plant_leaves
plant_tree
sky_cloud
sky_object
sky_sun
text_
text_mag
text_map
text_menu
text_sign
trans_bicycle
trans_bus
trans_car
trans_trainstation
Computer Vision API Frequently Asked Questions
4/18/2017 • 2 min to read • Edit Online

If you can't find answers to your questions in this FAQ, try asking the Computer Vision API community on
StackOverflow or contact Help and Support on UserVoice.
Question: Can I train Computer Vision API to use custom tags? For example, I would like to feed in pictures of cat
breeds to 'train' the AI, then receive the breed value on an AI request.
Answer: This function is currently not available. However, our engineers are working to bring this functionality to
Computer Vision.

Question: Can Computer Vision be used locally without an internet connection?


Answer: We currently do not offer an on-premise or local solution.

Question: Which languages are supported with Computer Vision?


Answer: Supported languages include:

SUPPORTED LANGUAGES

Danish (da-DK), Dutch (nl-NL), English, Finnish (fi-FI), French (fr-FR), German (de-DE), Greek (el-GR),
Hungarian (hu-HU), Italian (it-IT), Japanese (ja-JP), Korean (ko-KR), Norwegian (nb-NO), Polish (pl-PL),
Portuguese (pt-BR, pt-PT), Russian (ru-RU), Spanish (es-ES), Swedish (sv-SV), Turkish (tr-TU)

Question: Can Computer Vision be used to read license plates?


Answer: The Vision API offers good text-detection with OCR, but it is not currently optimized for license plates. We
are constantly trying to improve our services and have added OCR for auto license plate recognition to our list of
feature requests.

Question: Which languages are supported for handwriting recognition?


Answer: Currently, only English is supported.

Question: What types of writing surfaces are supported for handwriting recognition?
Answer: The technology works with different kinds of surfaces, including whiteboards, white paper, and yellow
sticky notes.

Question: How long does the handwriting recognition operation take?


Answer: The amount of time that it takes depends on the length of the text. For longer texts, it can take up to
several seconds. Therefore, after the Recognize Handwritten Text operation completes, you may need to wait before
you can retrieve the results using the Get Handwritten Text Operation Result operation.
Question: How does the handwriting recognition technology handle text that was inserted using a caret in the
middle of a line?
Answer: Such text is returned as a separate line by the handwriting recognition operation.

Question: How does the handwriting recognition technology handle crossed-out words or lines?
Answer: If the words are crossed out with multiple lines to render them unrecognizable, the handwriting
recognition operation doesn't pick them up. However, if the words are crossed out using a single line, that crossing
is treated as noise, and the words still get picked up by the handwriting recognition operation.

Question: What text orientations are supported for the handwriting recognition technology?
Answer: Text oriented at angles of up to around 30 to 40 degrees may get picked up by the handwriting
recognition operation.
The Research Behind Computer Vision API
4/12/2017 • 1 min to read • Edit Online

Hao Fang, Saurabh Gupta, Forrest Iandola, Rupesh Srivastava, Li Deng, Piotr Dollar, Jianfeng Gao, Xiaodong He,
Margaret Mitchell, John Platt, Lawrence Zitnick, and Geoffrey Zweig, From Captions to Visual Concepts and Back,
CVPR, June 2015 (Won 1st Prize at the COCO Image Captioning Challenge 2015)
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep Residual Learning for Image Recognition. arXiv
(Won both ImageNet and MS COCO competitions 2015)
Yandong Guo, Lei Zhang, Yuxiao Hu, Xiaodong He, Jianfeng Gao, MS-Celeb-1M: Challenge of Recognizing One
Million Celebrities in the Real World, IS&T International Symposium on Electronic Imaging, 2016
Xiao Zhang, Lei Zhang, Xin-Jing Wang, Heung-Yeung Shum, Finding Celebrities in Billions of Web Images, accepted
by IEEE Transactions on Multimedia, 2012
