Developing Metaweb-Enabled Web Applications
Metaweb Technologies, Inc.

Published 2007-03-08
Copyright © 2007 Metaweb Technologies, Inc.
Table of Contents
1. Introduction
1.1. The Metaweb Query API
1.2. About this Manual
2. Metaweb Architecture
2.1. The Metaweb Object Model
2.1.1. Common Object Properties
2.1.2. Names, Keys, and Ids
2.1.3. Topics
2.2. Values
2.2.1. /type/int
2.2.2. /type/float
2.2.3. /type/boolean
2.2.4. /type/id
2.2.5. /type/text
2.2.6. /type/key
2.2.7. /type/rawstring
2.2.8. /type/uri
2.2.9. /type/datetime
2.3. Types
2.3.1. Core Types
2.3.2. Content Types
2.3.3. Access Control Types
2.4. Domains
2.5. Namespaces
2.6. Access Control
3. The Metaweb Query Language
3.1. JavaScript Object Notation
3.1.1. JSON Literals: null, true, false
3.1.2. JSON Numbers
3.1.3. JSON Strings
3.1.4. JSON Arrays
3.1.5. JSON Objects
3.2. MQL Tutorial
3.2.1. Our First Query
3.2.2. Query/Response Symmetry
3.2.3. Metaweb Object IDs
3.2.4. Multiple Results and Uniqueness Errors
3.2.5. Nested Queries
3.2.6. Asking Metaweb For Objects
3.2.7. Expanded Values and Default Properties
3.2.8. Review: Asking for Values
3.2.9. Too Much Information
3.2.10. The id and name Properties
3.2.11. Numeric Constraints
3.2.12. Textual Constraints: Pattern Matching in Queries
3.2.13. Limiting Queries
3.2.14. The Sort Directive
3.2.15. Ordered Collections
3.2.16. Optional Queries
3.2.17. Using Fully-Qualified Property Names
3.2.18. Wildcards
3.2.19. Expressing AND in Queries
3.2.20. Expressing OR in Queries
3.2.21. Expressing NOT in Queries
3.2.22. Reflective Queries
3.3. The MQL Grammar
4. Metaweb Read Services
4.1. Basic mqlread Queries with Perl
4.1.1. A Better Perl Album Lister
4.2. The mqlread Service
4.2.1. mqlread Input
4.2.2. mqlread Output
4.2.3. Query and Response Envelopes
4.3. A Python Album Lister
4.4. A Metaweb-enabled PHP Web Application
4.5. Metaweb Queries with JavaScript
4.5.1. Listing Albums and Tracks with JavaScript
4.5.2. Client-side MQL Queries with <script>
4.6. mqlread Errors
4.7. mqlread Cursors
4.8. Fetching Content with trans
4.8.1. Browsing Recent Content on freebase.com
4.9. Example: A Metaweb Type Browser
5. The MQL Write Grammar
5.1. MQL Write Tutorial
5.1.1. Creating a Type to Work With
5.1.2. Creating Objects
5.1.3. Connecting Objects
5.1.4. Disconnecting Objects
5.1.5. Writes and Default Properties
5.1.6. Creating and Connecting More Objects
5.1.7. Review: Write Directives
5.1.8. Working with Sets
5.1.9. Bidirectional Links and Reciprocal Properties
5.1.10. Writes and Ordered Collections
5.1.11. Namespaces
5.1.12. Properties, Types, and Domains
5.2. MQL Write Grammar
6. Metaweb Write Services
6.1. Logging in to Metaweb
6.1.1. The Login API
6.2. Making Write Queries
6.2.1. The mqlwrite Query Envelope
6.2.2. The Response Envelope
6.2.3. A mqlwrite Utility Function
6.2.4. Example: US State Quarters
6.2.5. Sending Multiple Queries to mqlwrite
6.3. Uploading Data to Metaweb
6.3.1. An Upload Utility
6.3.2. Examples: Uploading Images of State Quarters
6.3.3. Uploading Documents
6.4. Example: Unlinking Objects
A. Additional Code
A.1. json.js
A.2. Client-side MQL Queries through a Proxy
A.3. Example: Auto-completion with mqlread

List of Examples
3.1. qedit.html: Code for a Metaweb query editor
4.1. albumlist.pl: submitting MQL queries in Perl
4.2. albumlist2.pl: a better Perl album lister
4.3. metaweb.py: using mqlread with Python
4.4. albumlist.py: listing albums in Python
4.5. metaweb.php: using mqlread with PHP
4.6. albumlist.php: A Metaweb-enabled web application in PHP
4.7. albumlist.html: a JavaScript album and track lister
4.8. metaweb.js: Metaweb queries with script tags
4.9. metaweb.py: querying Metaweb with a cursor, in Python
4.10. WhatsNew.html: fetching new images and documents from freebase.com
4.11. TypeBrowser.html: a Metaweb type browser
6.1. metaweb.py: logging in to Metaweb with Python
6.2. metaweb.py: sending a query to mqlwrite
6.3. quarters.py: writing a data set to Metaweb
6.4. metaweb.py: uploading content to Metaweb
6.5. quarterpix.py: uploading images to Metaweb
6.6. uploaddoc.py: uploading HTML documents to Metaweb
6.7. unlink.py: unlinking Metaweb objects
A.1. json.js: JSON parsing and serialization in JavaScript
A.2. metaweb_proxy.js: Metaweb queries through a proxy
A.3. mqlread.php: a trivial mqlread proxy in PHP
A.4. completiontest.html: using Metaweb.addValidationAndCompletion()
A.5. validateAndComplete.js: form validation and completion with mqlread

Chapter 1. Introduction
Freebase is a vast, free, open online database of structured knowledge, powered and maintained
by Metaweb Technologies (metaweb.com). Users can access and contribute to Freebase at
http://www.freebase.com, or through the Metaweb API explained in this manual. If you visit the
freebase.com website, you'll find that Metaweb has seeded the database with detailed information
about popular music and movies. Figure 1.1 is a sample page from this site:

Figure 1.1. Browsing knowledge at freebase.com

This manual teaches you how to write Metaweb-enabled programs that interact with Freebase.
It assumes that you already know the "what" and "why" of Freebase, and that you have read the
documentation topics (such as "What is Freebase" and "Freebase Demo") linked from the Freebase
home page http://www.freebase.com/view/.

Metaweb and Freebase

Metaweb (the company) has developed Metaweb (the technology and API). Freebase (the
open global structured knowledge base) is a high-profile public instantiation of the Metaweb
technology, but is unlikely to be the only instantiation.

This manual documents general Metaweb services and APIs, and relies on Freebase for
example data and example applications. The services and APIs are applicable beyond
Freebase, however, and you'll find that this manual uses the name "Metaweb" far more
than it does the name "Freebase".

1.1. The Metaweb Query API


Metaweb offers a powerful API for making programmatic queries. This allows you to incorporate
knowledge from the Freebase database into your own applications and websites. Let's take this
API for a spin. Type the following URL into your web browser's location bar. (Type it all on a
single line: it is broken across two lines here to fit on the printed page.)

http://www.freebase.com/api/service/mqlread?queries={"albums":{"query":
{"type":"/music/artist","name":"The Police","album":[]}}}
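
The query in this URL can also be assembled by a program rather than typed by hand. The following Python sketch builds the same URL with the standard json and urllib.parse modules. The helper name build_mqlread_url is hypothetical (it is not part of any Metaweb library), and the endpoint is simply the one shown in the URL above. The code only constructs and decodes the URL; it does not contact the server.

```python
import json
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint taken from the URL shown above.
MQLREAD_URL = "http://www.freebase.com/api/service/mqlread"

def build_mqlread_url(name, query):
    """Wrap `query` in an envelope under the key `name` and return a GET URL.

    Hypothetical helper for illustration only.
    """
    envelope = {name: {"query": query}}
    # Compact separators keep the URL short; urlencode escapes the punctuation.
    params = {"queries": json.dumps(envelope, separators=(",", ":"))}
    return MQLREAD_URL + "?" + urlencode(params)

query = {"type": "/music/artist", "name": "The Police", "album": []}
url = build_mqlread_url("albums", query)
print(url)

# Round trip: decoding the query string recovers the original envelope.
decoded = json.loads(parse_qs(urlsplit(url).query)["queries"][0])
print(decoded)
```

Unlike the hand-typed URL, the generated one percent-encodes the braces and quote marks, which is the safer form to send over HTTP.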

There are a lot of braces, quote marks, colons and commas in that URL, but remember that this
is a programmatic API: the query is supposed to be generated by a computer, not pecked out by
human fingers! Translated into English, this query says:

Find an object in the database whose type is "/music/artist" and whose name is
"The Police". Then return its array of albums.

If you got all the punctuation correct, the Metaweb server will respond to this query with a response
of MIME type application/json. The response is plain text, but your browser will probably
not display it to you. Instead, the browser will allow you to save it to a file, which you can then
view from the command line or with any text editor. When you view it, you'll see something like
this:

{
  "status": "200 OK",
  "query": {
    "album": [],
    "type": "/music/artist",
    "name": "The Police"
  },
  "messages": [],
  "result": {
    "album": [
      "Outlandos d'Amour",
      "Reggatta de Blanc",
      "Live in Boston",
      "Zenyatta Mondatta",
      "Ghost in the Machine",
      "Synchronicity"
    ],
    "type": "/music/artist",
    "name": "The Police"
  }
}
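
Once saved, a response like this is easy to process with any JSON parser. As a minimal sketch, the Python code below parses an abbreviated copy of the response and pulls out the album list. The function name extract_albums is our own, and the check on the status field assumes that a successful response always starts with "200", as it does in the example above.

```python
import json

# Abbreviated copy of the response shown above.
response_text = """
{
  "status": "200 OK",
  "query": {"album": [], "type": "/music/artist", "name": "The Police"},
  "messages": [],
  "result": {
    "album": ["Outlandos d'Amour", "Reggatta de Blanc", "Synchronicity"],
    "type": "/music/artist",
    "name": "The Police"
  }
}
"""

def extract_albums(text):
    """Return the album list from a response, or raise if the status is not OK."""
    response = json.loads(text)
    # Assumption: any status other than "200 ..." signals a failed query.
    if not response["status"].startswith("200"):
        raise RuntimeError("query failed: " + response["status"])
    return response["result"]["album"]

albums = extract_albums(response_text)
for title in albums:
    print(title)
```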

Cookie-Based Authentication

While Metaweb is rolling out and scaling up the freebase.com website, queries like the one
shown above can only be made by users who have registered for a Freebase account.
Metaweb uses cookie-based authentication. This means that if you have logged on at
www.freebase.com, your web browser will have the cookies it needs for the query URL
above to work. But if you enter that URL on a web browser that has never visited Freebase
before, you'll get an HTTP "401 Unauthorized" error.

Once freebase.com is fully deployed, read queries like this will be open to the world and
no cookies will be required.

The response has the same braces and quotes that the query did: they provide the structure that
makes this response easy to parse (for a computer). This response begins with an HTTP status
code. It repeats the query we made, and then provides the response to our query. Our query
included the text:

"album":[]

In the response, those empty square brackets have been filled in with a long list of album names.
(For brevity, a number of live and compilation albums were omitted from the list shown above.)

Making queries from your web browser's location bar is interesting, but it becomes much more
interesting if we make the queries under programmatic control. Imagine a script running on your
own web server that sends queries to Freebase and formats the results as HTML: this is a
Metaweb-enabled web application. It might look like Figure 1.2.

Figure 1.2. A Metaweb-enabled web application

1.2. About this Manual


The goal of this manual is to explain everything you need to know to create Metaweb-enabled
web applications like the one pictured in Figure 1.2. We assume that you are a programmer who
has some experience with languages like PHP, Python, JavaScript and Java, and that you
understand the basics of the HTTP protocol.

The chapters that follow are:

Chapter 2: Metaweb Architecture


This chapter explains the Metaweb architecture, including a discussion of Metaweb objects,
values, types, domains, namespaces and access control.

Chapter 3: The Metaweb Query Language
This chapter explains the Metaweb Query Language (MQL) that is used to express Metaweb
queries. The syntax is quite a bit more powerful and complex than what was shown in this
introduction.

Chapter 4: Metaweb Read Services


This chapter explains how to read data from Metaweb servers. It demonstrates (with examples
in a variety of scripting languages) how to submit MQL queries to the mqlread service, and
how to interpret the response. It also demonstrates how to use the trans service to download
content (such as HTML articles or binary image data) from Metaweb.

This chapter includes the source code for the application that created Figure 1.2.

Chapter 5: The MQL Write Grammar


This chapter explains MQL syntax for writing to the Metaweb database. The MQL write
grammar resembles MQL for read queries, but is different in a number of important ways.

Chapter 6: Metaweb Write Services


This chapter explains how to write to Metaweb servers. It demonstrates (with Python code)
the mqlwrite service for sending write queries to Metaweb, and the upload service for
uploading textual or binary content. In addition, it shows how to use the login service, which is a
necessary prerequisite for mqlwrite and upload.

Chapter 2. Metaweb Architecture
A Metaweb database is a sea of knowledge organized as a graph: a set of nodes and a set of links
or relationships between those nodes. This chapter covers the key features of the Metaweb
architecture, and explains how Metaweb types and properties tame this vast graph of knowledge by
defining a manageable object-oriented view of it.

2.1. The Metaweb Object Model


Figure 2.1 illustrates one tiny (hypothetical) piece of the Metaweb graph of nodes and relationships.

Figure 2.1. Nodes and relationships

This portion of the Metaweb graph organizes knowledge about something named "Arnold". It
tells us that Arnold is a Person, Politician, Body Builder, and Actor. It tells us that Arnold's
country of birth is Austria, his political party is Republican, and that he acted in something named
"Terminator" (which is an instance of something known as a "Film"). The relationships in the
graph are bi-directional, so this figure also tells us, for example, that Austria has Arnold as a
citizen, the Republican Party has Arnold as a member, and that Terminator has Arnold as a cast
member. (Note that this is an example only. An "Arnold Schwarzenegger" node does exist at
www.freebase.com, but it may or may not have the particular relationships pictured here.)
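
To make the idea of bi-directional relationships concrete, here is a small Python sketch of such a graph. It is a toy model of the concept, not a description of how Metaweb stores data: the Graph class and its methods are our own invention, and the relationship label "acted in" is ours as well (the other labels come from the description above).

```python
from collections import defaultdict

class Graph:
    """A toy graph: every link is stored in both directions."""

    def __init__(self):
        # node -> relationship name -> set of neighbouring nodes
        self.links = defaultdict(lambda: defaultdict(set))

    def link(self, source, forward, target, reverse):
        """Connect two nodes, naming the relationship in each direction."""
        self.links[source][forward].add(target)
        self.links[target][reverse].add(source)

    def neighbours(self, node, relationship):
        """Return the nodes reached from `node` by `relationship`, sorted."""
        return sorted(self.links[node][relationship])

g = Graph()
g.link("Arnold", "country of birth", "Austria", "citizen")
g.link("Arnold", "political party", "Republican", "member")
g.link("Arnold", "acted in", "Terminator", "cast member")
g.link("Arnold", "type", "Actor", "instances")

print(g.neighbours("Arnold", "country of birth"))  # ['Austria']
print(g.neighbours("Terminator", "cast member"))   # ['Arnold']
```

Because each call to link records both directions, a question asked from either end of a relationship can be answered without searching the whole graph.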

This nodes-and-relationships representation of knowledge is ideal for searching algorithms, but
is not ideal for human understanding: we quickly become lost in the maze of links. In order to
make the data structure more understandable to humans, Metaweb allows us to view the graph
through an object-oriented lens. Rather than thinking about nodes and their relationships to other
nodes, this object-oriented view lets us think about objects and their properties:

Arnold
sex: male
birthdate: 1947-July-30
country of birth: Austria
political party: Republican
film: Conan the Barbarian
film: Terminator
film: Kindergarten Cop
elected office: Governor of California

In this view, Arnold is an object with a set of properties. Each property has a name and a value.
What is missing from the view is any kind of typing. In many object-oriented systems, each
property of an object has a known type, and the value of that property must be a member of that
type.

Look back at Figure 2.1 again, and consider the relationships labeled type and instances. Arnold
is an instance of Person, Actor, and Politician. Person, Actor, and Politician are Metaweb types:
they are nodes in the Metaweb graph, but they also impose an object-oriented structure on the
graph. Each type defines a set of properties that its instances are expected to have. Each property
has a name and a type. An object in a Metaweb database, therefore, is a node in the graph, plus
the type that it should be viewed as:

Arnold as Person
    Sex            sex: male
    Date           birthdate: 1947-July-30
    Country        birthplace: Austria

Arnold as Politician
    ElectedOffice  office: Governor of CA
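
One way to picture "a node plus the type it should be viewed as" is as a filter applied to the node's properties. The Python sketch below illustrates this under our own simplifying assumptions: TYPES, arnold and view_as are hypothetical names, and plain dictionaries stand in for Metaweb types and nodes.

```python
# Each type names the properties its instances are expected to have,
# together with the expected type of each property (from the views above).
TYPES = {
    "Person": {"sex": "Sex", "birthdate": "Date", "birthplace": "Country"},
    "Politician": {"office": "ElectedOffice"},
}

# The untyped node: every property known about Arnold, regardless of type.
arnold = {
    "sex": "male",
    "birthdate": "1947-July-30",
    "birthplace": "Austria",
    "office": "Governor of CA",
}

def view_as(node, type_name):
    """Return only the properties of `node` that `type_name` defines."""
    schema = TYPES[type_name]
    return {prop: node[prop] for prop in schema if prop in node}

print(view_as(arnold, "Person"))
print(view_as(arnold, "Politician"))
```

The same node yields a different object depending on the type chosen, which is the sense in which a type acts as a lens on the graph.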

Next, let's consider Arnold as an Actor. Notice that the list of properties above included three
properties named film. This is perfectly fine for a nodes-and-relationships model, but it doesn't
fit our object-oriented model where we expect each property to have a single value. A Metaweb
type may specify whether each of its properties must be unique or not. For the Actor type, we'd
need a non-unique property named film. The type of this property is a set of films that Arnold
has acted in:

Arnold as Actor
    Set of Film    film: [Conan the Barbarian,
                          Kindergarten Cop,
                          Terminator]

Note that the film property is an unordered set of values, not an ordered list of values. If you
wanted to display this set of films to an end user, you would most likely want to arrange them
into alphabetical order, or by release date. You can ask Metaweb to order them for you, or you
can sort them yourself. Some sets, such as the set of tracks on an album, have an implicit order,
and you can ask Metaweb to return the members of the set in this order. We'll see how to do this
in Chapter 3.
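
If you choose to sort the set yourself, the work is small. The following Python sketch sorts the three films alphabetically and by release date. The release years are supplied here purely for illustration; they do not come from this manual or from a Metaweb query.

```python
# An unordered set: Python makes no promise about iteration order.
films = {"Conan the Barbarian", "Kindergarten Cop", "Terminator"}

# Release years are supplied for illustration; they are not in the manual.
release_year = {
    "Conan the Barbarian": 1982,
    "Terminator": 1984,
    "Kindergarten Cop": 1990,
}

# sorted() accepts any iterable and always returns an ordered list.
alphabetical = sorted(films)
by_release = sorted(films, key=release_year.get)

print(alphabetical)
print(by_release)
```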

2.1.1. Common Object Properties
All Metaweb objects, regardless of their type or types, define the following properties:

name This property is a set of human-readable names for the object, suitable for display
to the end users of Metaweb. Each name is a /type/text value which holds a
string and defines the human language in which it is written. The name property
is special in two ways:

• An object may have more than one name, but may only have one name per
language. That is, it can have only one English name, only one French name,
and so on.

• When querying Metaweb, you may treat the name property as if it were a single
/type/text value rather than a set of values. Metaweb will automatically
return the object's name (if it has one) in your language of choice.

key This property is a set of fully-qualified names for the object. These fully-qualified
names are intended for use by developers and scripts and are not typically
displayed to end users. Each member of the set is a /type/key value that specifies
a namespace object and a name within the namespace. Metaweb guarantees
that no two objects will ever have the same fully-qualified name.

guid Every object in a Metaweb database has a globally unique identifier or guid.
The guid property specifies the unique identifier for an object. A guid is a long
string of hexadecimal digits following the hash character, and might look like
this: #0801010a40005e838000000000019bd2. No two objects will ever have the
same value of the guid property. This property is read-only.

id The id property is used to uniquely identify an object either by its guid, or by
a fully-qualified name defined by a key property. If you query the id property
of an object, Metaweb usually returns the guid of the object to you. For objects
that are instances of core types, Metaweb returns a fully-qualified name (such
as /type/text or /lang/en) instead of a guid. Like guid, the id property is
unique: no two objects will ever have the same value for this property. This
property is read-only. You may not set the id property. (If you add a key property
to an object, however, you will be able to refer to the object through a new
fully-qualified name.)

type This property is the set of types associated with the object. The object can be
viewed as an instance of any of these types. Each type is itself a Metaweb object,
of /type/type.

timestamp This read-only property is a single value of /type/datetime that specifies when
the object was created.

creator This read-only property is a single link to a /type/user object that specifies
which Metaweb user created the object.

permission This read-only property is a single link to a /type/permission object. A permission
object specifies which Metaweb usergroups are allowed to alter the
object. See Section 2.6 for more on users, usergroups and permissions.

2.1.2. Names, Keys, and Ids


Notice that four of the eight common properties described above have to do with names and
identifiers for Metaweb objects. It is important to understand the difference between human-
readable names, fully-qualified names, and guids.

A Metaweb database contains an object that represents the human language English. The name
property of this object specifies its human-readable name: "English". Metaweb objects can have
only a single name in each language. Our English object might have names "Anglais" and "Ingles"
in French and Spanish. It is important to understand that the human-readable name of an object
does not uniquely identify it: there may be many other Metaweb objects with the name "English".

Because the name property allows only one name in each language, you cannot use it to specify
nicknames for an object. You cannot, for example, give the English object the name "American
English" in addition to "English". As we'll see below, most Metaweb objects that are intended
for display to end-users are instances of a type called /common/topic. This type defines a property
named alias, which you can use to specify any number of nicknames for an object.
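
The difference between names and aliases can be sketched in a few lines of Python. This is a toy model under our own assumptions: the NamedObject class is hypothetical, the language id /lang/fr is assumed by analogy with /lang/en, and the choice to let a second name in the same language silently replace the first is ours (this section does not say how Metaweb itself responds to such an attempt).

```python
class NamedObject:
    """Toy model: one display name per language, any number of aliases."""

    def __init__(self):
        self.names = {}       # language id -> human-readable name
        self.aliases = set()  # nicknames, in the spirit of the alias property

    def set_name(self, lang, text):
        # Assumption of this sketch: a second name in the same language
        # replaces the first, so there is never more than one per language.
        self.names[lang] = text

    def name(self, lang="/lang/en"):
        """Return the name in the requested language, or None if there is none."""
        return self.names.get(lang)

english = NamedObject()
english.set_name("/lang/en", "English")
english.set_name("/lang/fr", "Anglais")  # /lang/fr is an assumed language id
english.aliases.add("American English")

print(english.name())            # English
print(english.name("/lang/fr"))  # Anglais
print(sorted(english.aliases))   # ['American English']
```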

The key property of the English object is completely different from the name property. It specifies
that the object has the name "en" in a particular namespace object. That namespace object has a
key property of its own, which specifies that it has the name "lang" in a special root namespace
object. Metaweb uses the slash character to delimit names, so the English object has the fully-
qualified name "/lang/en". Fully-qualified names are intended for developers and are often used
in code, so we'll usually write them in code font like this: /lang/en.

The critical thing about fully-qualified names is that they are unique. Metaweb ensures that no
two objects ever have the same fully-qualified name at the same time.

Human-readable names and fully-qualified names are optional; Metaweb objects are not required
to have either. But every object does have a guid value that identifies it uniquely. A unique guid
is assigned to an object when it is created, and it never changes. It is always possible to uniquely
identify an object by specifying the value of its guid property. The guid of the /lang/en object
is "#9202a8c04000641f8000000000000092".

Guids and fully-qualified names are both unique identifiers for objects. The id property is flexible
and allows you to use either one. If you want to refer to the English object, you could specify an
id property of "#9202a8c04000641f8000000000000092" or "/lang/en".
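
Because an id may take either form, client code sometimes needs to tell the two apart. The following Python sketch does so using only the descriptions given in this chapter: a guid is a hash character followed by hexadecimal digits, and a fully-qualified name begins with a slash. The function id_kind is a hypothetical helper, and the pattern is deliberately loose because the text does not fix the number of digits.

```python
import re

# A guid is described as a hash character followed by hexadecimal digits.
# The number of digits is left open because the text does not specify it.
GUID_RE = re.compile(r"#[0-9a-f]+")

def id_kind(value):
    """Classify an id string as 'guid', 'name' (fully-qualified), or 'invalid'."""
    if GUID_RE.fullmatch(value):
        return "guid"
    if value.startswith("/") and len(value) > 1:
        return "name"
    return "invalid"

print(id_kind("#9202a8c04000641f8000000000000092"))  # guid
print(id_kind("/lang/en"))                            # name
print(id_kind("English"))                             # invalid
```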

2.1.3. Topics
Objects that are displayed to users of freebase.com are called topics. These are regular Metaweb
objects that are members of the type /common/topic in addition to any of their other,
more-specific types. /common/topic defines properties that allow descriptions, nicknames, documents and
images to be associated with an object, and the freebase.com client uses these properties to
assemble an informative web page that describes the object or topic.

All topics in Metaweb are also objects. But not all objects are topics. The distinction is that topics
are entries that might be of interest to end users. Objects that are not topics are typically part of
the Metaweb infrastructure, and may be of interest to Metaweb developers but not end users.
Types, properties, domains and namespaces are not topics, but albums, movies, and restaurants
are.

2.2. Values
Like many object-oriented programming languages, Metaweb draws a distinction between objects
(arbitrary collections of properties) and values (single primitives such as numbers, dates and
strings). Metaweb defines nine value types. Like all Metaweb types, value types are identified
by type objects. Each type object has a fully-qualified name such as /type/int (for the value
type that represents integer values).

Values have a dual nature in Metaweb. Depending on how you ask about them, they may behave
like primitives, or like simple objects. If you query a value as if it were an object, then it behaves
like a simple object with two properties (as we'll see shortly, two of the value types actually include
a third property as well):

value this property holds the primitive value

type this property refers to the type object that specifies the type of the value.

If you query a value as a primitive, then just the value of the value property is returned.
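The dual nature of values can be pictured with a small local model. This is only a sketch of the idea, not Metaweb code: the helper asPrimitive and the sample data are assumptions made for illustration.

```javascript
// A /type/int value seen as an object: "value" holds the primitive and
// "type" names the value type.
const count = { value: 42, type: "/type/int" };

// A /type/text value carries a third property, "lang".
const label = { value: "English", type: "/type/text", lang: "/lang/en" };

// Hypothetical helper: the primitive view keeps only the "value" property.
function asPrimitive(v) {
  return v.value;
}

console.log(asPrimitive(count)); // 42
console.log(asPrimitive(label)); // English
```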

The various Metaweb value types are described in the sub-sections that follow. Notice that value
types are in the /type domain, and that their names fall under the /type namespace. (We'll see
more about namespaces in Section 2.5.)

2.2.1. /type/int
Values of this type are signed integers. Metaweb uses a 64-bit representation internally, which
means that the range of valid values of /type/int is from -9223372036854775808 to
9223372036854775807. An integer literal is simply an optional minus sign followed by a sequence
of decimal digits. Metaweb does not support octal or hexadecimal notation for integers, nor does
it allow the use of exponential notation for expressing integers.
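The literal rules and the 64-bit range above can be checked on the client before a value is sent to Metaweb. The sketch below is one way to do so; isValidInt is a hypothetical name, and BigInt is used because ordinary JavaScript numbers cannot represent every 64-bit integer exactly.

```javascript
// Hypothetical validator for /type/int literals, following the rules above.
const INT_MIN = -(2n ** 63n); // -9223372036854775808
const INT_MAX = 2n ** 63n - 1n; //  9223372036854775807

function isValidInt(literal) {
  // Optional minus sign followed by decimal digits; no hex, octal or exponent.
  if (!/^-?[0-9]+$/.test(literal)) return false;
  const n = BigInt(literal); // BigInt avoids the precision loss of JS numbers
  return n >= INT_MIN && n <= INT_MAX;
}

console.log(isValidInt("9223372036854775807")); // true
console.log(isValidInt("9223372036854775808")); // false
```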

2.2.2. /type/float
Values of this type are signed numbers that may include an integer part, a fractional part, and an
order of magnitude (a power of ten by which the integer and fractional parts are multiplied).
Metaweb uses the 64-bit IEEE-754 floating point representation, which supports magnitudes
between 10^-324 and 10^308. C and Java programmers may recognize this as the double datatype.
Metaweb does not support the special values Infinity and NaN, however.

A literal of /type/float consists of an optional minus sign, an optional integer part, an optional
decimal point and fractional part, and an optional exponent. The integer and fractional parts are
simply strings of decimal digits. The exponent begins with the letter e or E, followed by an
optional minus sign, and one to three digits. The following are all valid /type/float literals:

1.0      # integer and fractional part
1        # integer part alone
.0       # fractional part alone
-1       # minus sign allowed as first character
1E-5     # exponent: 1 × 10^-5 or 0.00001
5.98e24  # weight of earth in kg: 5.98 × 10^24

There are an infinite number of real numbers, and a 64-bit representation can only describe a finite
subset of them. Any number with 12 or fewer significant digits can be stored and retrieved exactly
with no loss of precision. Numbers with more than 12 significant digits may have those digits
truncated when they are stored in Metaweb.
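The literal grammar described above translates fairly directly into a regular expression. The following is a sketch under stated assumptions: isValidFloat is a hypothetical name, a plus sign in the exponent is rejected only because the text does not mention one, and the finiteness test stands in for the rule that Infinity and NaN are not supported.

```javascript
// Hypothetical validator for /type/float literals, following the grammar above:
// optional minus, integer part and/or fractional part, optional exponent of
// one to three digits.
const FLOAT_LITERAL = /^-?(?:[0-9]+(?:\.[0-9]+)?|\.[0-9]+)(?:[eE]-?[0-9]{1,3})?$/;

function isValidFloat(literal) {
  // Reject anything that overflows a 64-bit double (it would become Infinity).
  return FLOAT_LITERAL.test(literal) && Number.isFinite(Number(literal));
}

for (const s of ["1.0", "1", ".0", "-1", "1E-5", "5.98e24", "NaN", "1e999"]) {
  console.log(s, isValidFloat(s));
}
```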

2.2.3. /type/boolean
There are only two values for this type; they represent the boolean truth values true and false.
Note that Metaweb sometimes uses the absence of a value (null) in place of false.

2.2.4. /type/id
Values of this type are object identifiers, either guids or fully-qualified names. The object
properties guid and id have values of this type.

2.2.5. /type/text
An instance of /type/text is a string of text plus a value that specifies the human language of
that text. The name property of an object is a set of values of this type.

/type/text is unusual. Its value property specifies the text itself, but it also has a lang property
that specifies the language in which the text is written. The lang property refers to an object of
type /type/lang. The /lang namespace holds many instances of this type, such as /lang/en for
English. We'll say more about /type/lang and the /lang namespace later in this chapter.

The text of a /type/text value must be a string of Unicode characters, encoded using the UTF-
8 encoding. The encoded string must not occupy more than 4096 bytes. Longer chunks of text
(or binary data) can be stored in Metaweb in the form of a /type/content object, which is
described later.
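Because the 4096-byte limit applies to the UTF-8 encoding rather than to the number of characters, it helps to measure the encoded length. A minimal sketch, assuming a runtime that provides the standard TextEncoder (modern browsers and Node.js); fitsInText is a hypothetical name.

```javascript
// Hypothetical check: does a string fit in a /type/text value?
const MAX_TEXT_BYTES = 4096;

function fitsInText(s) {
  // TextEncoder always encodes to UTF-8, so .length is the byte count.
  return new TextEncoder().encode(s).length <= MAX_TEXT_BYTES;
}

console.log(fitsInText("a".repeat(4096)));      // true: 4096 one-byte characters
console.log(fitsInText("\u00e9".repeat(2049))); // false: 2049 two-byte characters
```

Strings that fail this check are candidates for storage as /type/content instead.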

2.2.6. /type/key
Instances of /type/key represent a fully-qualified name. The key property of an object is a set
of /type/key values. The value property of a /type/key value is the local, or unqualified part
of a fully-qualified name. Like /type/text, /type/key has a third property. The namespace
property of a key refers to the /type/namespace object that qualifies the local name. The
namespace property and the value property combine to produce a fully-qualified name.

As an example, consider the Metaweb object that represents the value type /type/int. The key
property of this object has a value of "int", and a namespace that refers to the /type namespace.
The /type namespace is also an object, and its key property has a value of "type" and a namespace
that refers to the root namespace object.

The value property of a key must be a string of ASCII characters, and may include letters,
numbers, underscores, hyphens and dollar signs. A key may not begin or end with a hyphen or
underscore. The dollar sign is special: it must be followed by four hexadecimal digits (using letters
A through F, in uppercase), and is used when it is necessary to map Unicode characters into
ASCII so that they can be represented in a key. To represent an extended Unicode character (that
does not fit in four hexadecimal digits), encode that character in UTF-16 using a surrogate pair,
and then express the surrogate pair using two dollar-sign escapes.

Keys used as names for domains, types and properties are further restricted: they may not include
hyphens or dollar signs, and may not include two underscores in a row.
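The key syntax rules above can be sketched as a pair of helpers: one that escapes arbitrary text into key-safe ASCII and one that checks a candidate key. Both function names are hypothetical, and the code follows only the rules stated in this section; it does not apply the additional restrictions on domain, type and property names.

```javascript
// Hypothetical helpers for /type/key values, based only on the rules above.

// Escape every UTF-16 code unit that is not a letter, digit, underscore or
// hyphen as "$" plus four uppercase hex digits. Characters that need more than
// four hex digits are already surrogate pairs in a JavaScript string, so they
// naturally become two escapes.
function escapeKey(s) {
  let out = "";
  for (let i = 0; i < s.length; i++) {
    const ch = s[i];
    if (/[A-Za-z0-9_-]/.test(ch)) {
      out += ch;
    } else {
      out += "$" + s.charCodeAt(i).toString(16).toUpperCase().padStart(4, "0");
    }
  }
  return out;
}

// Check allowed characters, well-formed escapes, and the rule that a key may
// not begin or end with a hyphen or underscore.
function isValidKey(key) {
  return /^(?:[A-Za-z0-9_-]|\$[0-9A-F]{4})+$/.test(key) &&
         !/^[-_]/.test(key) && !/[-_]$/.test(key);
}

console.log(escapeKey("caf\u00e9"));  // caf$00E9
console.log(escapeKey("\u{1D11E}"));  // $D834$DD1E
```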

2.2.7. /type/rawstring
A value of /type/rawstring is a string of bytes with no associated language specification. The
length of the string must not exceed 4096 bytes.

Use /type/rawstring instead of /type/text for small amounts of binary data and for textual
strings that are not intended to be human readable.

2.2.8. /type/uri
An instance of /type/uri represents a URI (Uniform Resource Identifier: see RFC 3986). The
value property holds the URI text, which should consist entirely of ASCII characters. Any
non-ASCII characters, and any characters that are not allowed in URIs, should be URI-encoded using
hexadecimal escapes of the form %XX to represent arbitrary bytes.
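In JavaScript, the built-in encodeURI function performs this kind of escaping. The sketch below assumes that non-ASCII characters should be escaped as their UTF-8 bytes, which is what encodeURI does; the example URL is made up.

```javascript
// encodeURI leaves characters that are legal in a URI alone and replaces
// everything else with %XX escapes of the character's UTF-8 bytes.
const raw = "http://example.com/caf\u00e9 menu";
const encoded = encodeURI(raw);

console.log(encoded); // http://example.com/caf%C3%A9%20menu
```

Note that encodeURI also escapes the % character itself, so it should be applied only to text that has not already been URI-encoded.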

2.2.9. /type/datetime
An instance of /type/datetime represents an instant in time. That instant may be as long as a
year or as short as a fraction of a second. The value property is a string representation of a date
and time formatted according to a subset of the ISO 8601 standard. /type/datetime only supports
dates specified using month and day of month. It does not support the ISO 8601 day-of-year,
week-of-year and day-of-week representations.

A /type/datetime value that represents the first millisecond of the 21st century looks like this:

2001-01-01T00:00:00.001Z

Notice the following points about this format:

• Longer intervals of time (years, months, etc.) are specified before shorter intervals (minutes,
seconds, etc.).

• Years must be specified with a full four digits, even when the leading digits are zeros. Negative
years are allowed, but years with more than four digits are not allowed.

• Months and days must always be specified with two digits, starting with 01, even when the
first digit is a 0.

• The components of a date are separated from each other with hyphens.

• A date is separated from the time that follows with a capital letter T.

• Times are specified using a 24-hour clock. Midnight is hour 00, not hour 24. Hours and minutes
must be specified with two digits, even when the first digit is 0.

• Seconds must be specified with two digits, but may also include a decimal point and a fractional
second. Metaweb allows up to 9 digits after the decimal point.

• The hours, minutes, and seconds components of a time specification are separated from each
other with colons.

• A time may be followed by a timezone specification. The capital letter Z is special: it specifies
that the time is in Universal Time, or UTC (formerly known as GMT). Local timezones that
are later than UTC (east of the Greenwich meridian) are expressed as a positive offset of hours
and minutes such as +05:30 for India. Local times earlier than UTC are expressed with a negative
offset such as -08:00 for US Pacific time. If no timezone is specified, then the
/type/datetime value is assumed to be a local time in an unknown timezone. Specifying a
timezone of +00:00 is the same as specifying Z. Specifying -00:00 is the same as omitting the
timezone altogether.

• All characters used in the /type/datetime representation are from the ASCII character set,
so date and time values can be treated as strings of 8-bit ASCII characters.

A /type/datetime value can represent time at various granularities, and any of the date or time
fields on the right-hand side can be omitted to produce a value with a larger granularity. For
example, the seconds field can be omitted to specify a day, hour, and minute. Or all the time fields
and the day-of-month field can be omitted to specify just a year and a month. Also, the date fields
can be omitted to specify a time that is independent of date. A timezone may not be appended to
a date alone: there must be at least an hour field specified before a timezone.

Here are some example /type/datetime values that demonstrate the allowed formats:

2001                    # The year 2001
2001-01                 # January 2001
2001-01-01              # January 1st 2001
2001-01-01T01Z          # 1 hour past midnight (UTC), January 1st 2001
2000-12-31T23:59Z       # 1 minute before midnight (UTC) December 31st, 2000
2000-12-31T23:59:59Z    # 1 second before midnight (UTC) December 31st, 2000
2000-12-31T23:59:59.9Z  # .1 second before midnight (UTC) December 31st, 2000
00:00:00Z               # Midnight, UTC
12:15                   # Quarter past noon, local time
17-05:00                # Happy hour, Boston (US Eastern Standard Time)
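The formatting rules above can be approximated with a regular expression. This is a sketch, not Metaweb's own validator: looksLikeDatetime is a hypothetical name, the pattern checks only the shape of the string (it does not reject month 13 or hour 25), and it assumes that a time may follow only a complete year-month-day date.

```javascript
// Hypothetical shape check for /type/datetime values.
// Time: HH, HH:MM, HH:MM:SS or HH:MM:SS.fraction (up to 9 fraction digits).
const TIME = "\\d{2}(?::\\d{2}(?::\\d{2}(?:\\.\\d{1,9})?)?)?";
// Timezone: Z or a signed HH:MM offset.
const ZONE = "(?:Z|[+-]\\d{2}:\\d{2})";

const DATETIME = new RegExp(
  "^(?:" +
    // A date (year, year-month or year-month-day), optionally followed by
    // T, a time and a timezone. A timezone never follows a date alone.
    "-?\\d{4}(?:-\\d{2}(?:-\\d{2}(?:T" + TIME + ZONE + "?)?)?)?" +
  "|" +
    // A time that is independent of any date.
    TIME + ZONE + "?" +
  ")$"
);

function looksLikeDatetime(s) {
  return DATETIME.test(s);
}

console.log(looksLikeDatetime("2000-12-31T23:59:59.9Z")); // true
console.log(looksLikeDatetime("2001-01-01Z"));            // false
```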

2.3. Types
Types that are not value types are object types. Metaweb pre-defines a number of object types,
organized into domains of related types. Metaweb users are allowed (and encouraged) to define
new object types as needed. Pre-defined object types can be categorized into the core types that
are part of the Metaweb infrastructure, common types that are used commonly throughout Metaweb,
and domain-specific types. The core types are all part of the /type domain (which they share
with the value types), and the common types are all part of the /common domain. freebase.com
defines many domain-specific types, such as the music-related types /music/artist, /music/album
and /music/track. Figure 2.2 illustrates these type categories.

+--/type/id
|
+--/type/int
|
+--/type/float
|
+--/type/boolean
|
+--Value Types--+--/type/text
| |
| +--/type/rawstring
| | +--/restaurant domain
| +--/type/uri |
| | +--/location domain
| +--/type/datetime |
| | +--/film domain +-/music/track
Types-+ +--/type/key | |
| +--/music domain--+-/music/album
| | |
| +--Freebase Types-----+--/book domain +-/music/artist
| | |
| | +--etc.
| |
+--Object Types-+--Core Types (/type domain)
|
+--Common Types (/common domain)
|
+--User-defined Types

Figure 2.2. Categories of Metaweb types

The sub-sections that follow introduce the most important core and common types. You do not
need to understand these types in detail in order to make productive use of Metaweb. Still,
knowing what these basic types are is a helpful orientation to the system.

2.3.1. Core Types
Types, properties, domains and namespaces are fundamental to the Metaweb architecture, but
are represented by ordinary Metaweb types. These most fundamental types are described below.

2.3.1.1. /type/object
Earlier in this chapter, we explained that all Metaweb objects share a set of common properties:
name, id, key and so on. These universal object properties are defined by a core type named
/type/object. If you are an object-oriented programmer familiar with languages such as Java,
you might guess that /type/object is the root of the type hierarchy, and that it is the superclass
of all other object types.

In fact, however, Metaweb does not have a type hierarchy. Types do not have supertypes.
/type/object is not a normal type. Objects are never declared to be instances of this type.
Remember that one of the common object properties is type: it specifies a set of types for the object.
/type/object never needs to be a member of this set. In fact, an object's set of types can be
empty, and the object will still have all of the common properties. The /type/object type exists
simply as a convenient placeholder. It serves to group the /type/property objects that represent
the common object properties.

2.3.1.2. /type/type
This type describes a type, which means that it is the only type that is an instance of itself. Types
have five properties:

properties The set of properties defined by the type.

instance The set of instances of the type. For commonly used types, this
set may obviously grow quite large. Recall, however, that all relationships
between objects in Metaweb are inherently bi-directional. Since
every object has a type property that refers to its type, it follows that
every type has a set of incoming links from its instances. Thus, every
type automatically maintains a set of its instances.

domain The domain to which the type belongs.

expected_by The set of properties whose value is of the type.

default_property The name of the default property for the type. When you ask Metaweb
to return an object as if it were a primitive value, Metaweb returns the
value of the default property for that type. For value types, the default
property is value. For most object types the default property is name.
And for core types in the /type domain, the default property is id.
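The rule of thumb for default properties can be written down as a small lookup. This is only an approximation for orientation: in Metaweb the default property is recorded on each type object, the text says "most" object types use name, and guessDefaultProperty is a hypothetical helper.

```javascript
// The nine value types described in Section 2.2.
const VALUE_TYPES = new Set([
  "/type/int", "/type/float", "/type/boolean", "/type/id", "/type/text",
  "/type/key", "/type/rawstring", "/type/uri", "/type/datetime",
]);

// Hypothetical helper: guess the default property from the type's id alone.
function guessDefaultProperty(typeId) {
  if (VALUE_TYPES.has(typeId)) return "value";    // value types
  if (typeId.startsWith("/type/")) return "id";   // core types in /type
  return "name";                                  // most other object types
}

console.log(guessDefaultProperty("/type/datetime")); // value
console.log(guessDefaultProperty("/type/property")); // id
console.log(guessDefaultProperty("/music/artist"));  // name
```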

2.3.1.3. /type/property
Every type defines a set of properties for its instances. The members of this set are /type/property
objects. The common name and key properties of a property object specify the human-readable
and fully-qualified names for the property. In addition, properties specific to /type/property
specify:

• The expected type of the value of the property

• Whether the property is unique. A unique property may only have a single value (or may have
no value). A property that is not unique has a set of zero or more values.

• The reciprocal property, if there is one.

• The type of which this property is a part.

The notion of a reciprocal property deserves more explanation. Recall that all links in Metaweb
are bi-directional. This means that any time a property of type A refers to an object of type B,
Metaweb automatically has a link from that object of type B back to the originating object of
type A. Type B can take advantage of this bi-directionality and include a property that links back
to objects of type A. As a concrete example, consider the properties property of /type/type:
it specifies the set of properties for a type. Its reciprocal is the schema property of /type/property,
which specifies the type object (or "schema") of which the property is a part. You'll find further
exploration of reciprocal properties in Chapter 5.
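A toy in-memory model may make the idea of a reciprocal pair more concrete. This is not how Metaweb stores links; the helper addProperty and the property id /music/album/release_date are made up for illustration.

```javascript
// Toy in-memory model of a reciprocal pair (not Metaweb's implementation).
// The type and property ids below are illustrative only.
const albumType = { id: "/music/album", properties: [] };
const releaseDate = { id: "/music/album/release_date", schema: null };

// Adding a property to a type records the link in both directions:
// "properties" on the type and its reciprocal "schema" on the property.
function addProperty(type, property) {
  type.properties.push(property);
  property.schema = type;
}

addProperty(albumType, releaseDate);
console.log(releaseDate.schema.id);                // /music/album
console.log(albumType.properties.map(p => p.id));  // [ '/music/album/release_date' ]
```

In Metaweb itself a single link serves both directions, so the two views cannot drift apart the way two separate fields could in this model.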

2.3.1.4. /type/domain
A domain represents a set of related types, and also serves as a namespace for those types. For
access control purposes, each domain object refers to one or more usergroup objects that "own"
the domain. Only members of the specified usergroups are allowed to add new types to the domain
or to edit types within the domain.

2.3.1.5. /type/namespace
This type represents a namespace, and is used by the value type /type/key. It defines the keys
property which is a set of /type/key values that specify the names in the namespace.

2.3.2. Content Types


The following types from the /type and /common domains are important content-related types:

/type/content
Large chunks of content, such as HTML documents and graphical images are not stored in
regular Metaweb nodes. Instead, these large objects (sometimes called lobs) are kept in a
separate store. A /type/content object is the bridge between the Metaweb object store and
the Metaweb content store. A /type/content object represents an entry in the content store,
and the guid of the /type/content object is used as an index for retrieving the content.

In addition to providing access to the content store, /type/content defines important
properties. The media_type property specifies the MIME type of the content. For textual content,
the text_encoding and language properties specify the encoding and language of the text.
The length property specifies the size (in bytes) of the content. The source property refers
to a /type/content_import object that specifies the source of the content.

Chapter 4 shows how to download content from Metaweb, and Chapter 6 demonstrates how
to upload content.

/type/content_import
This type describes the source of imported content. Its properties include the URI or filename
from which the content was obtained, the user who imported the content, and a timestamp
that specifies when the content was imported.

/type/media_type
Instances of this type represent a MIME media type such as "text/html" or "image/png".
Instances are given fully-qualified names within the /media_type namespace, and can be
specified with ids like /media_type/text/html or /media_type/image/png.

/type/text_encoding
Instances of this type represent standard text encodings, such as ASCII and Unicode UTF-8.
Instances are given fully-qualified names within the /media_type/text_encoding namespace,
and can be specified with ids such as /media_type/text_encoding/ascii.

/type/lang
This type represents a human language. It is used by /type/content objects and also by
/type/text values. Pre-defined instances of this type are given fully-qualified names within
the /lang namespace, and can be specified with ids like /lang/en and /lang/fr.

/common/topic
As described earlier in this chapter, Metaweb objects that are intended for display to end
users are called "topics". Such objects typically have some appropriate domain-specific type,
such as /music/artist or /food/restaurant, but are also instances of the type /common/topic.
This type defines properties that allow documents and images to be associated with the
topic. Another property allows a set of URLs to be associated with the topic. Also, because
objects can only have a single name in any given language, /common/topic has an alias
property that allows any number of nicknames to be specified for the topic.

/common/document
This type represents a document of some sort. /common/topic uses this type to associate
documents with topics. The most important property is content, which specifies the single
/type/content object that refers to the document content. Other properties of /common/document
provide meta-information about the document, such as authors, publication date, and
so on.

/common/image
/type/content objects that represent images are typically co-typed with this type.
/common/image defines a size property that specifies the pixel dimensions of the image.

2.3.3. Access Control Types


The following types are part of the Metaweb access control framework.

/type/user
Each registered Metaweb user is represented with an object of /type/user. User objects have
fully-qualified names in the /user namespace. If your username is joe_developer, then your
/type/user object is /user/joe_developer.

/type/usergroup
This type represents a set of users.

/type/permission
This type is the key to Metaweb access control. Its properties specify the set of objects that
require this permission for modifications, and also the set of usergroups that have the
permission. See Section 2.6 for further details.

2.4. Domains
A domain is a Metaweb object of /type/domain. It represents a collection of related types. We've
already seen a number of types from the /type and /common domains. freebase.com pre-defines
types in a number of general domains, and Chapter 3 and Chapter 4 feature many examples using
the Freebase /music domain. The set of Freebase domains is expected to grow, but at the time
of this writing, it includes:

/business     /food       /measurement_unit
/education    /language   /music
/film         /location

As you might guess from the names of these domains, domain objects are also instances of
/type/namespace, and the types contained by domains are members of both the domain and the
namespace.

Every Metaweb user who registers for an account has their own domain. If your Metaweb username
is fred, then your domain is /user/fred/default_domain. When you use the freebase.com client
to define a new type named Beer, it is given the id /user/fred/default_domain/beer. If your
type becomes an important and commonly used one, it may be promoted by Metaweb
administrators to a top-level domain. In this case, your type might be given a new fully-qualified name
like /zymurgy/beer.

2.5. Namespaces
Namespaces are a critical part of the Metaweb infrastructure because they allow us to refer to
important objects, such as types, with simple mnemonic names rather than opaque guids. It would
be very inconvenient to query Metaweb if we had to write
"#9202a8c04000641f8000000000000565" instead of "/common/topic", for example.

We've already learned about a number of important namespaces including /type, /user, /lang,
and /media_type. In addition to these, each domain and user object is also a namespace. Also,
there is the root namespace, whose id is simply /.

A number of important namespaces are populated with pre-defined objects using names defined
by international standards. The languages in the /lang namespace use language codes (such as
"en" for English and "fr" for French) defined by ISO 639. The media types in /media_type are
defined by IANA and listed at http://www.iana.org/assignments/media-types/. And the text
encodings in /media_type/text_encoding use names defined by IANA at
http://www.iana.org/assignments/character-sets.

2.6. Access Control


Metaweb is completely open for reading. Anyone who can connect to Metaweb servers can read
data from them. When adding or editing data, however, access control comes into play. We've
already seen that the types /type/user, /type/usergroup, and /type/permission are used for
access control.

Metaweb's access control model is quite simple. Every object has a permission property that
refers to a /type/permission object. The permission object specifies a set of usergroups whose
members have permission to modify the object. If a user is a member of one or more of the
specified groups, then that user can edit the object. Otherwise, the user is not allowed to.
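The check described above amounts to a membership test. The sketch below is a toy model, not Metaweb's implementation; the object shapes, the property names usergroups and members, and the function canEdit are assumptions chosen to mirror the prose.

```javascript
// Toy model of the permission check described above (not Metaweb's code).
// An object's permission lists usergroups; membership in any one is enough.
function canEdit(username, object) {
  return object.permission.usergroups.some(
    group => group.members.includes(username)
  );
}

const fredsGroup = { members: ["fred", "jill"] };
const beerType = { permission: { usergroups: [fredsGroup] } };

console.log(canEdit("jill", beerType));  // true
console.log(canEdit("alice", beerType)); // false
```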

This simple access control model is, by default, also very open. In order to allow and encourage
free collaboration, most Metaweb objects have a permission object that gives edit permission to
all Metaweb users. If Metaweb user Fred creates a new object, his friend Jill can freely edit that
object. Any other Metaweb user can edit the object as well, and there is no way for Fred to restrict
the permission on his object.

The primary exception to this open access control model is type objects. Having a stable type
system is very important. Each domain has a usergroup associated with it, and only members of
that usergroup can create new types in the domain or alter existing types in the domain. Each
user account has an associated domain. Fred's domain is /user/fred/default_domain. This
domain has an associated usergroup. Initially, Fred is the only member of this group. He is allowed
to add to the usergroup, and if he adds his friend Jill, then she is permitted to create new types
in Fred's domain.

Other key parts of the Metaweb infrastructure also have restrictive access control, of course.
Ordinary users are not allowed to insert objects into the /lang namespace or the /type domain,
for example.

Chapter 3. The Metaweb Query Language
This chapter explains the Metaweb Query Language, or MQL, which is used to express Metaweb
queries. This chapter begins and ends with formal rules of MQL syntax, but the middle is an
extended tutorial that teaches MQL by example. You are expected and encouraged to run queries
and to experiment with your own queries, using a "query editor" program that submits your
queries to Metaweb and displays the results.

This chapter teaches you to write MQL queries, but does not explain how to issue those queries
to and retrieve responses from Metaweb servers: that is the topic of Chapter 4. Also, this chapter
does not cover updates, or writes, to Metaweb. Updates are expressed using a variant of MQL
that is covered in Chapter 5.

3.1. JavaScript Object Notation


The Metaweb queries and responses we saw in Chapter 1 contained a lot of punctuation: curly
braces, quotation marks, colons, and commas. Before we study more queries, it is important to
understand this punctuation. Metaweb queries and responses use a plain-text data interchange
format known as JavaScript Object Notation or, more commonly, JSON. If you are a JavaScript
programmer, then this format will be familiar to you since it is a subset of the JavaScript language.[1]
If you are not a JavaScript programmer, the format is easy to learn, and does not require the
use of the JavaScript language.

JSON is formally described in RFC 4627 (http://www.ietf.org/rfc/rfc4627.txt), and is also
documented at http://json.org. The JSON website includes pointers to code, in a variety of programming
languages, for serializing data structures into JSON format and for parsing JSON text into data
structures.[2]

Figure 3.1. JSON Values

[1] You should read this section even if you already know JavaScript. JSON is only a subset of JavaScript, and its syntax is stricter than
JavaScript syntax.
[2] The JSON syntax diagrams that appear below are also from the JSON website, where they have been placed in the public domain.

A JSON-formatted string is a serialized form of an array or object. The array or object may contain
numbers, strings, other arrays and objects, and the literal values null, true, and false. These
JSON values are illustrated in Figure 3.1 and explained in the sub-sections that follow:

3.1.1. JSON Literals: null, true, false


JSON supports three literal values. null is a JSON value representing "no value". The literals
true and false represent the two possible Boolean values.

3.1.2. JSON Numbers


A JSON number consists of an optional minus sign followed by an integer part followed by an
optional decimal point and fractional part followed by an optional exponent. This format is
similar to the format described for /type/float in Chapter 2, except that JSON requires the
integer part (so .0 is not a valid JSON number). All numbers use decimal digits: octal
and hexadecimal notation are not supported.

3.1.3. JSON Strings


A JSON string is much like a string in Java or JavaScript: zero or more Unicode characters[3]
between double quotation marks. See Figure 3.2.

Figure 3.2. JSON string syntax

[3] JSON itself supports 32-bit, 16-bit and 8-bit encodings of Unicode text. Metaweb, however, requires the 8-bit UTF-8 encoding.

A backslash is special: it is an escape character and is interpreted along with the character or
characters that follow:

Escape   Character
\"       A quotation mark that does not terminate the string
\\       A single backslash character that is not an escape
\/       A forward slash character. Although it is legal to escape the forward slash
         character, it is never necessary to do so.
\b       The Backspace character
\f       The Formfeed character
\n       The Newline character
\r       The Carriage Return character
\t       The Tab character
\uXXXX   The Unicode character whose encoding is the four hexadecimal digits XXXX. To
         encode extended Unicode codepoints that do not fit in four hex digits, use two
         \u escapes to encode a UTF-16 surrogate pair.
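The escapes in the table can be tried out with the standard JSON.parse function. In this sketch the backslashes are doubled because the JSON text is itself written inside a JavaScript string literal; the sample strings are made up.

```javascript
// JSON.parse interprets the escapes listed in the table above.
const parsed = JSON.parse('"tab:\\t quote:\\" slash:\\/ e-acute:\\u00e9"');
console.log(parsed);

// An extended codepoint (U+1D11E, a musical G clef) written as two \u escapes
// that together form a UTF-16 surrogate pair.
const clef = JSON.parse('"\\ud834\\udd1e"');
console.log(clef.length);                        // 2 (two UTF-16 code units)
console.log(clef.codePointAt(0).toString(16));   // 1d11e
```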

3.1.4. JSON Arrays


An array is a comma-separated list of JSON values enclosed in square brackets. See Figure 3.3.

Arrays may contain any JSON values, including objects and other arrays. The elements of a JSON
array need not have the same type (though in MQL they always do). The following JSON array
might be returned in response to a MQL query:

["Outlandos d'Amour", "Reggatta de Blanc", "Zenyatta Mondatta"]

Figure 3.3. JSON array syntax

A JSON array with no elements consists of just the square brackets: []. Empty arrays often appear
in MQL queries.

3.1.5. JSON Objects
A JSON object is named after the JavaScript object type, and is not very much like the objects
of strongly-typed object-oriented programming languages. Instead, think of an object as:

• an associative array;

• a hashtable that maps strings to values;

• a dictionary; or

• an unordered set of named values.

JSON objects are written as a comma-separated list of name/value pairs, enclosed in curly braces.
A name/value pair is a JSON string (the name) followed by a colon followed by any JSON value,
which may include nested objects and arrays. See Figure 3.4.

Figure 3.4. JSON object syntax

Here is an example JSON object (which also happens to be a Metaweb query):

{
  "type" : "/music/artist",
  "name" : "The Police",
  "album" : []
}

JavaScript programmers should note that JSON requires property names to appear within double
quotes, even though the JavaScript language does not. Arbitrary whitespace is allowed within
JSON objects and arrays, but trailing commas (after the final array element or last name/value
pair) are not. An empty JSON object, with no properties at all, is simply a pair of curly braces:
{}. As we'll see, empty objects are not uncommon in MQL queries.
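The stricter rules are easy to confirm with the standard JSON.parse function, which rejects text that a JavaScript interpreter would accept as an object or array literal. The helper isValidJson is a hypothetical convenience wrapper.

```javascript
// Hypothetical helper: true if the text parses as JSON, false otherwise.
function isValidJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (e) {
    return false; // JSON.parse throws a SyntaxError on malformed input
  }
}

console.log(isValidJson('{"type": "/music/artist", "album": []}')); // true
console.log(isValidJson('{type: "/music/artist"}'));                // false: unquoted name
console.log(isValidJson('["a", "b",]'));                            // false: trailing comma
console.log(isValidJson('{}'));                                     // true: empty object
```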

3.2. MQL Tutorial
This section is a tutorial that teaches Metaweb queries by example, and uses freebase.com as a
source of interesting data to query. Try to follow along as you read it by trying out the queries
presented. To do this, you need a simple way to submit a query to freebase.com and view the
result. You can do this with the Freebase query editor at http://www.freebase.com/view/queryeditor/,
or you can create your own simple query editor: save the code from Example 3.1 to a local
file, and view it in your web browser. Figure 3.5 shows the resulting UI, displaying a simple
query and response. (Remember that during the roll-out period, freebase.com queries require
cookie-based authentication. So the query editor of Example 3.1 will only work in web browsers
that have previously logged on to freebase.com and have authentication credentials stored in a
cookie.)

Figure 3.5. A simple Metaweb query editor

Example 3.1. qedit.html: Code for a Metaweb query editor

<html>
<head><title>Metaweb Query Editor</title>
<style> /* CSS styles for nice output */
#q, #r { width: 400px; height: 300px; border-width:0px;padding:5px;}
th {background-color:black; color:white; font:bold 12pt sans-serif;}
td.border,th {border:solid black 3px;}
input { margin: 5px; font-weight: bold; }
table { border-collapse: collapse;}
</style>
</head>
<body>
<!-- Form makes an HTTP GET request to mqlread, results go in iframe r -->

<form action="http://www.freebase.com/api/service/mqlread"
method="get"
target="r">
<!-- Force mqlread to return text/javascript instead of application/json -->
<input type="hidden" name="callback" value=" ">
<!-- An HTML table to display the query, buttons, and result -->
<table>
<tr><th>Query</th><td></td><th>Result</th></tr> <!-- table header -->
<tr>
<td class="border">
<!-- enter query -->
<textarea name="queries" id="q">
{
"qname": {
"query": [{
enter query here
}]
}
}
</textarea></td>
<td valign="top">
<input type="submit" value="Send"><br> <!-- send query -->
<input type="reset" value="Erase"></td> <!-- erase query -->
<td class="border">
<iframe name="r" id="r"></iframe></td> <!-- results go here -->
</tr>
</table>
</form>
</body>
</html>

A Note about Query Envelopes

A MQL query is a JSON object. In order to get a Metaweb server to execute the query,
however, you must nest it inside two more JSON objects, which are known as the "envelope".
We'll explain envelopes in detail in Section 4.2.3 when we're explaining how to send MQL
queries to Metaweb. For now, however, you just need to know enough about envelopes so
that you can try out queries in query editors.

In order to execute a query in the query editor of Example 3.1, you must put this text before
your query:

{"qname":{"query":

The text "qname" is actually an arbitrary name for the query; you can use any name you
want here. The text "query" must be entered verbatim. Whitespace is optional in JSON, so
you can of course insert spaces and newlines into that text to make it look nice. Since the
envelope is a JSON object, you must provide the matching closing braces. This means that
you must follow your query with:

}}


The Freebase query editor at http://www.freebase.com/view/queryeditor/ is a little easier.
When you use it, you omit the outer part of the envelope, and just begin the query with
{"query": and end it with }.

Metaweb's responses to MQL queries are also JSON objects and, like the queries, are wrapped
within envelope objects. If you use the query editor in Example 3.1, you'll see the complete
response envelope object, and you'll find the MQL query result as the value of a property
named result in an object that is itself the value of a property named qname (or whatever
query name you choose in the query envelope). If you use the Freebase query editor, you
have the option of viewing just the result or the complete response envelope. In this chapter,
we omit the response envelopes except when we need to look at error messages that appear
in the envelope.
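
The envelope rules described in this note are mechanical enough to automate. The following Python sketch is only an illustration: the function names are hypothetical, the service URL and the queries parameter name are copied from the form in Example 3.1, and the callback parameter that the form also sends is left out. The sketch builds the request URL but does not send it.

```python
import json
from urllib.parse import urlencode

# Service URL taken from the form in Example 3.1.
MQLREAD_URL = "http://www.freebase.com/api/service/mqlread"

def wrap_query(query, qname="qname"):
    """Nest a MQL query inside the two envelope objects."""
    return {qname: {"query": query}}

def mqlread_url(query, qname="qname"):
    """Build the GET URL for an enveloped query (callback parameter omitted)."""
    envelope = json.dumps(wrap_query(query, qname))
    return MQLREAD_URL + "?" + urlencode({"queries": envelope})

def unwrap_result(response, qname="qname"):
    """Return the MQL result from a parsed response envelope."""
    return response[qname]["result"]

query = {"type": "/music/artist", "name": "The Police", "album": []}
url = mqlread_url(query)
print(url)
```

Keeping the query itself free of envelope details means the same query object can be pasted into either query editor.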

3.2.1. Our First Query


Let's begin by revisiting the simple query from Chapter 1. We would like to know what albums
The Police have recorded:

{
"type" : "/music/artist",
"name" : "The Police",
"album" : []
}

Once we extract the result from its envelope, we're left with the following JSON object (some
of the album names are omitted here for brevity):

{
"type": "/music/artist",
"name": "The Police",
"album": [
"Outlandos d'Amour",
"Reggatta de Blanc",
"Zenyatta Mondatta",
"Ghost in the Machine",
"Synchronicity"
]
}

To query Metaweb we tell it what we already know by specifying properties and their values:

"type" : "/music/artist",
"name" : "The Police",

And then we tell it what we want to know by specifying properties without values:

"album" : []

Sending an empty array in a MQL query tells Metaweb that we'd like to have the array filled in.
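
To make the "what we know" and "what we want to know" distinction concrete, here is a small Python sketch. The helper split_query is hypothetical and only illustrates how to read a query; it treats an empty array (and null, which is None in Python) as a request for a value and everything else as something we already know.

```python
def split_query(query):
    """Separate the properties we specify from the ones Metaweb should fill in."""
    known, wanted = {}, {}
    for name, value in query.items():
        # An empty array, or null (None in Python), asks Metaweb for a value.
        if value is None or value == []:
            wanted[name] = value
        else:
            known[name] = value
    return known, wanted

query = {"type": "/music/artist", "name": "The Police", "album": []}
known, wanted = split_query(query)
print("known:", known)
print("wanted:", wanted)
```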


Singular or Plural?

Note that the property we query in the example above is named "album" and not "albums",
even though bands like The Police may well have more than one album. This reflects the
underlying nature of Metaweb: the database object that represents The Police has many
links of type "album" that refer to the objects that represent those albums. The type
/music/artist aggregates these many links into a single set of albums, but retains the singular
name "album" because that is the name of the underlying link type.

Also, as we'll see soon, when you want to obtain information (such as a list of tracks) about
one particular album, you specify a single value (instead of an array) for the album property.
In this case, the singular name makes a lot of sense.

Although property names are typically singular, there are exceptions to this rule, and
sometimes you'll see a plural property name.

3.2.2. Query/Response Symmetry


Let's look one more time at the simple "albums by The Police" query and response from above.
This time the query and response are presented side-by-side to emphasize that the query and
response objects have the same properties, but the response object has values filled in:

Query                                   Result
{                                       {
  "type" : "/music/artist",               "type": "/music/artist",
  "name" : "The Police",                  "name": "The Police",
  "album" : []                            "album": [
}                                           "Outlandos d'Amour",
                                            "Live in Boston",
                                            "Reggatta de Blanc",
                                            "Zenyatta Mondatta",
                                            "Ghost in the Machine",
                                            "Synchronicity",
                                            "Every Breath You Take: The Singles",
                                            "Greatest Hits"
                                          ]
                                        }

This symmetry of queries and responses is a fundamental and elegant part of MQL. We'll use
this two-column query/response format throughout the chapter.

3.2.3. Metaweb Object IDs


Recall that all Metaweb objects have a unique identifier in their id property. Here's how we find
the id for The Police:


Query                                   Result
{                                       {
  "type" : "/music/artist",               "type" : "/music/artist",
  "name" : "The Police",                  "name" : "The Police",
  "id" : null                             "id" : "#1f800000000006df1b"
}                                       }

This query includes the same name and type as the last query. But instead of specifying an empty
array of albums, it specifies a null id. The null value is our query: this is what we want Metaweb
to fill in. The response looks just like the query, but the null is replaced with an id string.

Freebase IDs in this Tutorial

The ids shown in the online HTML-formatted version of this tutorial are valid freebase.com
guids, and should remain valid in perpetuity. In hardcopy and PDF versions of the tutorial,
guids have been shortened to allow queries and responses to fit side-by-side in the two-
column format shown above. To use these printed guids online, restore them to validity by
inserting 9202a8c0400064 between the # and the digit that follows it.
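
The restoration rule for printed guids is a simple string insertion. A minimal Python sketch follows; the function name restore_guid is ours.

```python
GUID_PREFIX = "9202a8c0400064"  # the digits omitted from printed guids

def restore_guid(short_guid):
    """Expand a shortened, printed guid into its full online form."""
    if not short_guid.startswith("#"):
        raise ValueError("guid must start with '#': %r" % short_guid)
    # Insert the omitted digits between the '#' and the digit that follows it.
    return "#" + GUID_PREFIX + short_guid[1:]

print(restore_guid("#1f800000000006df1b"))
```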

3.2.4. Multiple Results and Uniqueness Errors


Now that we know the id of the object, let's turn our query around and ask about the name and type
of the object with that id:

{
"id": "#1f800000000006df1b",
"name" : null,
"type" : null
}

We're telling Metaweb what we have (the id) and asking for the values (name and type) that we
don't have. When we submit this query, though, it doesn't work. The response envelope looks
like this:

{
  "status": "200 OK",
  "qname": {
    "status": "/mql/status/error",
    "messages": [
      {
        "status": "/mql/status/result_error",
        "info": {
          "count": 2,
          "result": [
            "#1f80000000011ae833",
            "#1f8000000000000565"
          ]
        },
        "path": "type",
        "query": {
          "error_inside": "type",
          "type": null,
          "id": "#1f800000000006df1b",
          "name": null
        },
        "message": "Unique query may have at most one result. Got 2",
        "type": "/mql/error"
      }
    ]
  }
}

The various status properties tell us that something is wrong with the query. The messages[0]
object provides details. Its message property gives us an error message, and its info object
provides details to go with the message. The query.error_inside and path properties tell us
that the error is associated with the type property in our query.
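
An application would normally inspect these properties in code rather than by eye. The Python sketch below pulls the path and message out of each entry in the messages array. The helper name query_errors is hypothetical. Because this section does not show the status value of a successful query, the sketch assumes that any inner status other than "/mql/status/error" means success.

```python
def query_errors(response, qname="qname"):
    """Return (path, message) pairs for a failed query, or [] otherwise."""
    inner = response.get(qname, {})
    # Assumption: any status other than the error status means success.
    if inner.get("status") != "/mql/status/error":
        return []
    return [(m.get("path"), m.get("message")) for m in inner.get("messages", [])]

# Abbreviated copy of the error envelope shown above.
response = {
    "status": "200 OK",
    "qname": {
        "status": "/mql/status/error",
        "messages": [{
            "status": "/mql/status/result_error",
            "info": {"count": 2},
            "path": "type",
            "message": "Unique query may have at most one result. Got 2",
            "type": "/mql/error",
        }],
    },
}

for path, message in query_errors(response):
    print(path + ": " + message)
```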

What we learn from this response is that Metaweb could not respond to our query because we
asked for a single type and it found two types. Let's try the query again. Now we're requesting a
single name and an array of types for this uniquely specified object. This query works:

Query                                   Result
{                                       {
  "id":"#1f800000000006df1b",             "id":"#1f800000000006df1b",
  "name" : null,                          "name" : "The Police",
  "type" : []                             "type" : [
}                                           "/common/topic",
                                            "/music/artist"
                                          ]
                                        }

The Metaweb object we asked about has the name "The Police" and it is a member of two types:
/common/topic and /music/artist. Recall from Chapter 2 that /common/topic is a very generic
type. Just about every Metaweb object that represents something an end user would have an interest
in is a member of this type. The lesson to draw here is that objects almost always have more than
one type, and any queries on the type property should use arrays. In general, it is always safe to
use [] in place of null in your queries. If there is only one result the array returned in the response
will simply have a single element. When you know that there can only be one result, however,
it is usually more convenient and efficient to use null.

Single Values and Sets of Values

There is a fundamental asymmetry to MQL: when we query the type of an object, we get
an array of types. But when we look up an object by type, we specify only one type. Metaweb
objects have a set of types, not one single type. So when we specify the type of an object
in a MQL query, all we are saying is that the object has at least one "type" link with that
value. Thus, writing "type":"/music/artist" in a query does not say "the type is
/music/artist", but "the set of types includes /music/artist". Put another way, we can say that a
query provides constraints, and that the response provides values for the unconstrained
properties of the query.

Uniqueness errors are a common pitfall for developers crafting Metaweb queries. Recall that
/type/property allows certain properties to be specified as unique. id is unique: no object can
have more than one id. The name property behaves as if it is unique (but is only unique per
language). As we've seen, however, the type property is not unique: objects can (and most objects
do) have more than one type. If a property is not guaranteed to be unique, then you should always
use square brackets when querying its value.

The id property is unique in another way. As we've seen, no object can have more than one id.
More importantly, however, no two objects share the same id. If a query includes an id, you
can therefore be confident that no more than one object will match, and a query like this
one is correct:

{
"id": "#1f800000000006df1b",
"name" : null,
"type" : []
}

Recall that an object can have only one name in any given language, and that the name property
behaves like a unique property even though it is not really. For this reason, it is always safe to
query name with null, as we do above, rather than [].

On the other hand, the query that we started this tutorial with is risky:

{
"type" : "/music/artist",
"name" : "The Police",
"album" : []
}

This query worked for us: Freebase only knows about one musical artist named "The Police".
Note, however, that there is no guarantee that this will always be the case. There is nothing to
prevent someone from adding another band named "The Police" to freebase.com. If such an
addition were made, our query would suddenly fail.

Depending on the design of your application, a uniqueness failure in this situation might actually
be exactly what you want. If you get two results when you expected one, then perhaps the right
thing to do is fail and display an error message to the user. On the other hand, you could write
your query more cautiously, using square brackets, so that multiple results can be returned:

[{
"type" : "/music/artist",
"name" : "The Police",
"album" : []
}]


In this case, you should be sure to check the number of results returned and take appropriate action
(such as asking the user to choose) if you get more than one.
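
One way to structure that check is sketched below in Python. The function choose_result and its choose callback are hypothetical; the callback stands in for whatever your application does to let the user pick among several matches.

```python
def choose_result(results, choose=None):
    """Reduce the list returned by a [{...}] query to one result, or None."""
    if not results:
        return None                      # nothing matched
    if len(results) == 1:
        return results[0]                # the common, unambiguous case
    if choose is None:
        raise LookupError("expected one result, got %d" % len(results))
    return choose(results)               # e.g. ask the user to pick one

bands = [{"name": "The Police", "album": ["Synchronicity"]}]
print(choose_result(bands))
```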

3.2.5. Nested Queries


Let's find out more about our favorite band. What are the names of the tracks on the album
Synchronicity?

Query                                   Result
{                                       {
  "type" : "/music/artist",               "type" : "/music/artist",
  "name" : "The Police",                  "name" : "The Police",
  "album" : {                             "album" : {
    "name" : "Synchronicity",               "name" : "Synchronicity",
    "track" : []                            "track" : [
  }                                           "Synchronicity I",
}                                             "Walking in Your Footsteps",
                                              "O My God",
                                              "Mother",
                                              "Miss Gradenko",
                                              "Synchronicity II",
                                              "Every Breath You Take",
                                              "King of Pain",
                                              "Wrapped Around Your Finger",
                                              "Tea in the Sahara",
                                              "Murder by Numbers"
                                            ]
                                          }
                                        }

The interesting thing about this query is that it includes a nested query. We're asking for an array
of tracks from an album named "Synchronicity" recorded by a band named "The Police".

There are other ways to obtain the same information. Here's another query that gets us the same
data:

{
"type" : "/music/album",
"name" : "Synchronicity",
"artist" : "The Police",
"track" : []
}

Rather than identifying the band first, and then querying an album recorded by that band, this
query goes straight to the album, which it identifies by name and by artist. (It assumes this is
enough to uniquely identify a single album and avoid uniqueness errors!)


3.2.6. Asking Metaweb For Objects
In our queries so far, we've used null and [] to ask Metaweb to fill in a single value or an array
of values. There are other ways to ask for information as well. Recall the following query:

{
"id" : "#1f800000000006df1b",
"name" : null,
"type" : []
}

It asks for the name and types of a unique object. Both the name and the individual elements of
the type array are returned as strings. Recall, however, that the name of an object is of /type/text
and that types are of /type/type. /type/text is a value type in the Metaweb object model, but
we can treat values as objects if we want to. Let's modify the query to use {} and [{}] instead
of null and []. {} asks for a single value, expanded as an object, and [{}] asks for an array of
values expanded into objects:

{
"id": "#1f800000000006df1b",
"name" : {},
"type" : [{}]
}

This query fails with a uniqueness error. The object we're querying has more than one name. The
name property behaves specially when queried with null: it returns the value of the name in the
default language. It only works to query name with {} if there is only one name, with no
translations. To make the query work, we ask for both the name and type with [{}]:

Query                                   Result
{                                       {
  "id": "#1f800000000006df1b",            "id":"#1f800000000006df1b",
  "name" : [{}],                          "name":[{
  "type" : [{}]                             "lang":"/lang/fr",
}                                           "type":"/type/text",
                                            "value":"The Police"
                                          },{
                                            "lang":"/lang/en",
                                            "type":"/type/text",
                                            "value":"The Police"
                                          },{
                                            "lang":"/lang/es",
                                            "type":"/type/text",
                                            "value":"The Police"
                                          }],
                                          "type":[{
                                            "id":"/music/artist",
                                            "name":"Musical Artist",
                                            "type":["/type/type","/freebase/type_profile"]
                                          },{
                                            "id":"/common/topic",
                                            "name":"Topic",
                                            "type":["/type/type","/freebase/type_profile"]
                                          }]
                                        }

We learn from this query that the name of the specified object is "The Police" in each of several
different languages (some languages have been omitted here). We also learn that the object is of
type /common/topic and /music/artist and that these types have common names (as opposed
to the formal ids that we use in queries) "Topic" and "Musical Artist".

Let's use this query technique to learn more about the tracks on the album Synchronicity. (The
result is truncated for brevity.)

Query                                   Result
{                                       {
  "type" : "/music/album",                "type": "/music/album",
  "name" : "Synchronicity",               "name": "Synchronicity",
  "artist" : "The Police",                "artist": "The Police",
  "track" : [{}]                          "track": [{
}                                           "type": [ "/music/track" ],
                                            "name": "Synchronicity I",
                                            "id": "#1f8000000001275dbb"
                                          },{
                                            "type": [ "/music/track" ],
                                            "name": "Walking in Your Footsteps",
                                            "id": "#1f8000000001275dc2"
                                          },{
                                            "type": [ "/music/track" ],
                                            "name": "O My God",
                                            "id": "#1f8000000001275dc9"
                                          }]
                                        }

This query doesn't actually tell us much about the tracks themselves. We already know the type
of the tracks. The id might be useful in future queries, but it doesn't tell us anything about the
track. The name is useful, but we could have obtained that without using curly braces, just by
querying "track":[].

When you ask Metaweb to fill in empty curly braces for you, it returns all the properties if the
value is a value type. The name property of an object is of /type/text, and querying it with {}
returns all of its properties. If the property is an object type instead of a value type, then Metaweb
returns only the name, type and id properties (all of which are defined by /type/object and are
common to all Metaweb objects). That is, instead of using [{}], we could write out the query
explicitly like this:


{
"type" : "/music/album",
"name" : "Synchronicity",
"artist" : "The Police",
"track" : [{
"name" : null,
"id" : null,
"type" : []
}]
}

What if we want to know absolutely everything freebase.com knows about the tracks on
Synchronicity? We write the query using a wildcard: 4

{
"type" : "/music/album",
"name" : "Synchronicity",
"artist" : "The Police",
"track" : [{"*":null}]
}

"*" is a wildcard property name. It means "all property names". (Note that it is different from [],
which means "all property values".) The type /music/track defines a number of its own properties,
and the expansion of the "*" wildcard also includes the universal properties defined by
/type/object. Here, for example, is what freebase.com knows about the song "Walking in Your
Footsteps":

{
"name":"Walking in Your Footsteps",
"type":["/music/track"],
"id":"#1f8000000001275dc2",
"guid":"#1f8000000001275dc2",
"creator":"/user/mwcl_musicbrainz",
"key":["a2313ee6-ccce-4ced-bc3c-af7d4b06f09f","TRACK179899"],
"permission":"/boot/all_permission",
"timestamp":"2006-12-10T00:39:58.0931Z",
"album":["Synchronicity"],
"length":[216.8],
"lyricist":[],
"lyrics":[],
"song":[],
"artist":null,
"composer":[],
"acquire_webpage":[]
}

If {} gives us too little useful information, and {"*":null} gives us more than we really need,
then we must refine our query to express exactly what it is we would like to know. Here's how
we ask for just the name and length of each of the tracks:

4 We'll return to the topic of wildcards later in this tutorial.

Query                                   Result
{                                       {
  "type" : "/music/album",                "type": "/music/album",
  "name" : "Synchronicity",               "name": "Synchronicity",
  "artist" : "The Police",                "artist": "The Police",
  "track" : [{                            "track": [
    "name":null,                            {"name":"Synchronicity I", "length":203.533},
    "length":null                           {"name":"Walking in Your Footsteps", "length":216.8},
  }]                                        {"name":"O My God", "length":242.226},
}                                           {"name":"Mother", "length":185.6},
                                            {"name":"Miss Gradenko", "length":120.0},
                                            {"name":"Synchronicity II", "length":305.066},
                                            {"name":"Every Breath You Take", "length":254.066},
                                            {"name":"King of Pain", "length":299.066},
                                            {"name":"Wrapped Around Your Finger", "length":313.733},
                                            {"name":"Tea in the Sahara", "length":255.44},
                                            {"name":"Murder by Numbers", "length":273.693}
                                          ]
                                        }

3.2.7. Expanded Values and Default Properties


In this tutorial we've said that we query the value of a property p with "p":null and "expand"
that value into an object with "p":{}. This is helpful terminology, but it is actually the opposite
of what is really going on. Everything in Metaweb is an object (or, in the case of literal values,
can be viewed as an object). When you use curly braces, objects are naturally expressed as objects.
When you use null, however, objects are compressed: instead of returning the complete object,
Metaweb returns only the value of the object's default property. If the object is of value type, this
default property is always the value property and is expressed as a string, number, or boolean
literal. If the object is not an instance of a value type, then the default property is either name or
id, both of which are expressed using string literals. Object types in the /type domain use id as
their default property. All other object types use name.

Default properties are not only used when you ask Metaweb to fill in a null or a [] for you. They
are also used when you express the information you already have. Consider the following query:

{
"type" : "/music/album",
"name" : "Synchronicity",
"artist" : "The Police",
"track" : []
}

This query could also be expressed more verbosely like this:

{
"type" : "/music/album",
"name" : {"value":"Synchronicity", "lang":"/lang/en"},
"artist" : {"type":"/music/artist", "name":"The Police"},
"track" : []
}

The verbose form of the query illustrates the fact that the succinct form relies on default properties.
The name property is of /type/text, whose default property is value. The artist property is of
type /music/artist, whose default property is name.
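
The default-property rules from this section can be summarized in a few lines of Python. This is only a sketch: the function names are hypothetical, the caller must say whether the type is a value type, and the rule "object types in the /type domain use id, all others use name" is applied by testing the "/type/" prefix of the type id. The lang property that the verbose form adds for /type/text values is not modeled.

```python
def default_property(type_id, is_value_type=False):
    """Name the property that a succinct value stands for."""
    if is_value_type:
        return "value"
    if type_id.startswith("/type/"):
        return "id"       # object types in the /type domain
    return "name"         # all other object types

def expand(value, type_id, is_value_type=False):
    """Rewrite a succinct value as the verbose object it abbreviates."""
    expanded = {default_property(type_id, is_value_type): value}
    if not is_value_type:
        expanded["type"] = type_id
    return expanded

print(expand("The Police", "/music/artist"))
print(expand("Synchronicity", "/type/text", is_value_type=True))
```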

3.2.8. Review: Asking for Values


If you want to ask Metaweb to return a value, use one of the terms listed in Table 3.1 on the right-
hand side of a property name:

Table 3.1. Asking for Values

Term          Meaning
null          If the property is of value type, return the value property. If the property is of
              object type, return the name or id property. The default_property property of
              /type/type specifies which property is returned for every object type.
[]            Like null, but return an array of values instead of a single value.
{}            If the property is of value type, return an object that represents the value. This
              object will have type and value properties. If the property is /type/text, the
              returned object will also have a lang property, and if it is of /type/key, it will
              have a namespace property.

              If the property is of object type, return an object that includes its name, id, and
              type properties. In this case, the term {} is equivalent to:
              {"name":null,"id":null,"type":[]}.
[{}]          Like {}, but return an array of objects instead of a single one.
{"*":null}    A query of this form returns an object and all of its properties. The meaning of
              "all of its properties" requires some explanation, however. Suppose Metaweb
              sees the query "p":{"*":null}. It looks up the property p and determines that
              its expected type is t. Then it looks up the type t, and determines what properties
              that type defines. Then it expands the wildcard query so that each property of t
              (plus the properties of /type/object) is queried with either null or [] depending
              on whether it is unique or not.
[{"*":null}]  Like {"*":null}, but return an array of objects instead of a single one.
{"*":{}}      A query of this form is like {"*":null}, except that when the wildcard is expanded,
              each property is queried with {} or [{}] instead of null or [].
[{"*":{}}]    Like {"*":{}}, but return an array of objects instead of a single one.
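
The wildcard expansion described in the table can be imitated in Python. This is a sketch under stated assumptions, not Metaweb's implementation: the schema dictionaries map property names to a "unique" flag, the /type/object entries are inferred from the "Walking in Your Footsteps" result shown earlier, and TRACK_PROPERTIES is a hypothetical subset of /music/track.

```python
# Universal properties and whether each is unique (inferred from the
# wildcard result shown earlier in this section; illustrative only).
OBJECT_PROPERTIES = {
    "name": True, "id": True, "guid": True, "creator": True,
    "permission": True, "timestamp": True, "type": False, "key": False,
}

# Hypothetical subset of the properties defined by /music/track.
TRACK_PROPERTIES = {"album": False, "artist": True, "composer": False}

def expand_wildcard(type_properties, as_objects=False):
    """Expand {"*":null}, or {"*":{}} when as_objects is true, for one type."""
    properties = dict(OBJECT_PROPERTIES)
    properties.update(type_properties)
    expanded = {}
    for name, unique in properties.items():
        if as_objects:
            expanded[name] = {} if unique else [{}]
        else:
            expanded[name] = None if unique else []
    return expanded

print(expand_wildcard(TRACK_PROPERTIES))
```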


3.2.9. Too Much Information
Suppose we want to find the answers to the following questions: Which artists have recorded
songs named Too Much Information? How long are the recordings, and on what albums were
they released?

Here's a simple query to answer this question, along with the freebase.com response:

Query                                   Result
[{                                      [{
  "type":"/music/track",                  "type" : "/music/track",
  "name":"Too Much Information",          "name" : "Too Much Information",
  "artist":null,                          "artist" : "The Police",
  "album":null,                           "album" : "Message in a Box (disc 3)",
  "length":null                           "length" : 222.733
}]                                      },{
                                          "type" : "/music/track",
                                          "name" : "Too Much Information",
                                          "artist" : "The Police",
                                          "album" : "Ghost in the Machine",
                                          "length" : 222.733
                                        }]

You should have no trouble understanding this query. It requests an array of tracks with the
specified name, and asks Metaweb to fill in the artist, album, and length of each track. But there
are other ways to ask for this information. The above track-centric query is simple, but returns
an unordered and unstructured list of tracks. If multiple artists have recorded the same song, we
might like the result to be organized by artist. Here's how to write an artist-centric version of the
query, along with the more structured response from freebase.com:

Query                                   Result
[{                                      [{
  "type":"/music/artist",                 "type" : "/music/artist",
  "name":null,                            "name" : "The Police",
  "album": [{                             "album" : [{
    "name":null,                            "name" : "Ghost in the Machine",
    "track": [{                             "track" : [{
      "name":"Too Much Information",          "name" : "Too Much Information",
      "length": null                          "length" : 222.733
    }]                                      }]
  }]                                      }, {
}]                                          "name" : "Message in a Box (disc 3)",
                                            "track" : [{
                                              "name" : "Too Much Information",
                                              "length" : 222.733
                                            }]
                                          }]
                                        }]


Take a look at that query again. It involves three different objects: an album, an artist, and a
track. We can't tell Metaweb anything interesting about the album (such as a name or id): just
that it contains the song we're interested in. We can't tell Metaweb anything about the artist object
either: just that they recorded an album that includes the song. Despite the seeming vagueness
of this query, Metaweb has no trouble finding the answer we want.

At first glance, it seems as if the only information we're providing to Metaweb with this query
is the track name. But notice that we also explicitly specify the type of the outermost object:
we've said that we want an object of type /music/artist. This is critical, because types have
properties, and properties specify the type of their values. Since we've specified that the outermost
object is /music/artist, Metaweb knows that the middle object is a /music/album (because that
is the type of the /music/artist/album property) and that the inner object is a /music/track
(because that is the type of the /music/album/track property). 5

We've answered our question about the song Too Much Information with a track-centric query
and an artist-centric query. For completeness, here is the album-centric query that returns the
same information:

[{
"type":"/music/album",
"name":null,
"artist":null,
"track": [{
"name":"Too Much Information",
"length": null
}]
}]
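
The three query shapes in this section return the same facts with different nesting. If an application needs flat rows regardless of which shape it used, the nested artist-centric response shown earlier can be flattened with a few loops. The Python sketch below uses a copy of that response; the function name flatten is ours.

```python
def flatten(artists):
    """Turn the nested artist/album/track result into flat rows."""
    rows = []
    for artist in artists:
        for album in artist["album"]:
            for track in album["track"]:
                rows.append((artist["name"], album["name"],
                             track["name"], track["length"]))
    return rows

# The artist-centric response from this section.
result = [{
    "type": "/music/artist",
    "name": "The Police",
    "album": [{
        "name": "Ghost in the Machine",
        "track": [{"name": "Too Much Information", "length": 222.733}],
    }, {
        "name": "Message in a Box (disc 3)",
        "track": [{"name": "Too Much Information", "length": 222.733}],
    }],
}]

for row in flatten(result):
    print(row)
```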

3.2.10. The id and name Properties


The id and name properties of every Metaweb object have special behavior that is important to
understand. Some of this behavior was explained in Chapter 2, but it bears repeating here.

The critical thing about id is that it is unique: every object's id is different. For objects, such as
types, that are organized into namespaces, the id is a fully-qualified name such as "/music/artist".
For other objects, the id is a guid: a unique, but meaningless, string of hexadecimal digits. Note
that although ids are represented with JSON strings, the id property of /type/object is of
/type/id rather than /type/text or /type/rawstring.

In addition to its guarantees of uniqueness, the id property has some special behavior. Specifically,
the id property cannot be constrained with pattern-matching or comparison operators, and cannot
be used as a sort key. (We'll learn about operators and sorting later in this tutorial.)

The special thing about the name property is that it behaves like a unique property (you can safely
query it with null instead of [], for example) but it is not truly unique. Any Metaweb object can
have multiple names, but may have only one name in any given language. That is, the name
property is unique on a per-language basis. When you query the name of an object, Metaweb
returns its name (if it has one) in your preferred language. (The desired language is specified as
a parameter to the mqlread query service, which is the topic of Chapter 4.)

5 Yes, properties have fully-qualified names that include the name of the type of which they are a part. We'll see example queries using
fully-qualified names later in this tutorial.

To demonstrate the special behavior of the name property, we must choose a topic that has
translations into other languages. Let's find the freebase.com topic named "Anarchism":

Query                                   Result
{                                       {
  "type" : "/common/topic",               "type": "/common/topic",
  "name" : "Anarchism",                   "name": "Anarchism",
  "id":null                               "id": "#1f8000000000003b60"
}                                       }

Now, let's take this object identified by id, and ask for its name:

Query                                   Result
{                                       {
  "id":"#1f8000000000003b60",             "id":"#1f8000000000003b60",
  "name":null                             "name":"Anarchism"
}                                       }

This simply returns the English name we started with: "Anarchism". Let's ask for all names:

Query                                   Result
{                                       {
  "id":"#1f8000000000003b60",             "id":"#1f8000000000003b60",
  "name":[]                               "name":["Anarchism"]
}                                       }

This query just returns the unique English name in an array. So let's try again and ask for all
names, along with the languages in which they are encoded:


Query                                   Result
{                                       {
  "id":"#1f8000000000003b60",             "id" : "#1f8000000000003b60",
  "name":[{}]                             "name" : [
}                                           {"lang":"/lang/en","type":"/type/text",
                                             "value":"Anarchism"},
                                            {"lang":"/lang/es","type":"/type/text",
                                             "value":"Anarquismo"},
                                            {"lang":"/lang/fr","type":"/type/text",
                                             "value":"Anarchisme"},
                                            {"lang":"/lang/it","type":"/type/text",
                                             "value":"Anarchismo"},
                                            {"lang":"/lang/de","type":"/type/text",
                                             "value":"Anarchismus"}
                                          ]
                                        }

Bingo! We find that this object has names in English (en), Spanish (es), French (fr), Italian (it),
and German (de).

Here's how we can ask for a name of the object in a specific language other than our preferred
language:

Query                                   Result
{                                       {
  "id":"#1f8000000000003b60",             "id" : "#1f8000000000003b60",
  "name":{                                "name": {
    "value":null,                           "value": "Anarchisme",
    "lang":"/lang/fr"                       "lang": "/lang/fr"
  }                                       }
}                                       }
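
If you have already fetched every name with "name":[{}], you can also pick a language on the client side instead of issuing a second query. The Python sketch below works on a copy of the names returned above; the helper name_in and its English fallback are our own choices.

```python
def name_in(names, lang, fallback="/lang/en"):
    """Pick the name in lang from a "name":[{}] result, else the fallback."""
    by_lang = {entry["lang"]: entry["value"] for entry in names}
    return by_lang.get(lang, by_lang.get(fallback))

# The names returned by the "name":[{}] query above.
names = [
    {"lang": "/lang/en", "type": "/type/text", "value": "Anarchism"},
    {"lang": "/lang/es", "type": "/type/text", "value": "Anarquismo"},
    {"lang": "/lang/fr", "type": "/type/text", "value": "Anarchisme"},
    {"lang": "/lang/it", "type": "/type/text", "value": "Anarchismo"},
    {"lang": "/lang/de", "type": "/type/text", "value": "Anarchismus"},
]

print(name_in(names, "/lang/de"))  # Anarchismus
```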

3.2.11. Numeric Constraints


We know how to ask "what are the names and lengths of the tracks on the album Synchronicity
by The Police?". The query looks like this:

{
"type":"/music/album",
"name":"Synchronicity",
"artist":"The Police",
"track":[{"name":null, "length":null}]
}

Metaweb also allows us to ask "What are the names and lengths of the long songs on the album?"
The query below includes a numeric constraint on the length property, and the freebase.com
response only includes the two songs on the album that are longer than 300 seconds:


Query                                   Result
{                                       {
  "type":"/music/album",                  "type" : "/music/album",
  "name":"Synchronicity",                 "name" : "Synchronicity",
  "artist":"The Police",                  "artist" : "The Police",
  "track":[{                              "track" : [{
    "name":null,                            "name" : "Synchronicity II",
    "length":null,                          "length" : 305.066
    "length>":300                         }, {
  }]                                        "name" : "Wrapped Around Your Finger",
}                                           "length" : 313.733
                                          }]
                                        }

The line "length>":300 in the query expresses a constraint to Metaweb: it specifies that the track
must be longer than 300 seconds. In addition to >, you can also use < for less-than, and <= and
>= for less-than-or-equal and greater-than-or-equal. Note, however, that no spaces are allowed
before or after these punctuation characters.

This constraint syntax looks quite odd at first. It is a result of the limitations of the JSON format:
everything must be expressed with property names, colons, and values. We would like to be able
to express a constraint like:

"length" <= 300

But that is not legal JSON syntax, so we express it instead like this:

"length<=" : 300

You can include more than one numeric constraint on the same property, restricting the value to
a range. Here's how we ask for songs that are at least three minutes long, but less than four:

{
"type":"/music/album",
"name":"Synchronicity",
"artist":"The Police",
"track":[{
"name":null,
"length":null,
"length>=":180,
"length<":240
}]
}

If you include a constraint on a property, you must also ask Metaweb to return the value of that
property. You cannot, for example, ask: "List all songs longer than 5 minutes, but don't bother to
tell me exactly how long they are."
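
When queries are built in code, these rules are easy to enforce in one place. The Python sketch below (the helper constrain is hypothetical) appends the operator directly to the property name with no spaces, accepts only the four comparison operators described above, and makes sure the constrained property is also requested.

```python
OPERATORS = ("<", ">", "<=", ">=")

def constrain(query, prop, op, value):
    """Return a copy of query with a comparison constraint on prop."""
    if op not in OPERATORS:
        raise ValueError("unsupported operator: %r" % op)
    constrained = dict(query)
    # A constrained property must also be requested, so ask for it
    # if the query does not already mention it.
    constrained.setdefault(prop, None)
    # The operator is appended directly to the property name: no spaces.
    constrained[prop + op] = value
    return constrained

# Tracks at least three minutes long, but less than four.
track = constrain({"name": None}, "length", ">=", 180)
track = constrain(track, "length", "<", 240)
print(track)
```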

Numbers are not the only type that can be constrained with these operators. Here, for example,
is a query that constrains a /type/datetime property to obtain a list of albums released in 1999:

[{
"type":"/music/album",
"name":null,
"artist":null,
"release_date":null,
"release_date>=":"1999-01-01",
"release_date<=":"1999-12-31"
}]

3.2.12. Textual Constraints: Pattern Matching in Queries

Metaweb queries can also place constraints on textual values. To do this, use the pattern matching
operator ~=. [6] To try this out, let's find some short songs about love:

[{
"type":"/music/track",
"artist":null,
"name":null,
"name~=":"love",
"length":null,
"length<":120
}]

Here's a query for songs about love recorded by bands whose name begins with "The":

[{
"type":"/music/track",
"artist":null,
"artist~=":"^The",
"name":null,
"name~=":"love"
}]

Results include If You Love Somebody, Set Them Free by The Police and I'm Sick of Love by The
White Stripes.

Notice that the constraint on the artist property in the query above uses the ^ character to specify
that the word The must appear at the beginning of the artist's name. If you're familiar with regular
expressions, this might make you think that Metaweb supports pattern matching with regular
expressions. In fact, Metaweb's matching syntax is closer to that used by internet search engines.
Table 3.2 summarizes MQL pattern matching syntax. Note that all searches are case-insensitive.

[6] If you've done programming with languages like Perl or Ruby, this syntax should look familiar. If you're not already familiar with
it, think of "~=" as meaning "approximately equal" or "like".

Table 3.2. MQL Pattern Matching Syntax

Pattern    Matches
love       Matches any string that contains the word "love". Does not match strings
           containing "glove" or "lover".
love*      Matches any string containing a word that begins with "love", such as "love",
           "lover" or "lovely". Does not match "glove".
*love      Matches any string containing a word that ends with "love", such as "love" or
           "glove".
*love*     Matches any string that contains "love", such as "love", "glove", "lover" and
           "glover".
*          Matches any single word.
love you   Matches any string that contains the phrase "love you". Does not match strings
           that contain "glove you", "love your", "you love", "love hate you" or "loveyou".
^          Matches the beginning of a string. For example, ^the matches any string that
           begins with the word "the", and ^the* matches any string that begins with a word
           that begins with "the", such as "they" or "there".
$          Matches the end of a string. For example, hits$ matches any string that ends with
           the word "hits", and *love$ matches "Sunshine of your Love" and "Smell the Glove".
-          A hyphen or other punctuation matches an optional space. For example,
           bi-directional matches "bi directional", "bi-directional", or "bidirectional".
\          Use a backslash to escape any punctuation character that you want to match
           literally. \' matches any string that contains an apostrophe, for example. Note,
           however, that JSON string literals require backslashes like this to be doubled.
           If you type a JSON query "by hand" or use string manipulation techniques to
           create a query, be sure to double the backslashes. If you use a JSON serializer
           to create the query, it should double the backslashes for you.
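
The note about doubled backslashes can be checked with any JSON serializer. The sketch below uses Python's standard json module; the query itself is only an illustrative example. The pattern is held in memory with a single backslash, and the serializer doubles it when producing the JSON text.

```python
import json

# The MQL pattern \' (one backslash followed by an apostrophe).
# In Python source the backslash must itself be escaped, so it is written "\\'".
pattern = "\\'"

query = [{"type": "/music/track", "name": None, "name~=": pattern}]
serialized = json.dumps(query)

# The serialized text contains the backslash twice, as JSON requires.
print(serialized)

# Parsing the JSON text gives back the original single-backslash pattern.
round_tripped = json.loads(serialized)[0]["name~="]
print(round_tripped == pattern)
```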

Here's a query to find all bands whose name is two words long and begins with the word The
(such as The Police, and The Clash):

[{
"type" : "/music/artist",
"name" : null,
"name~=" : "^The *$"
}]

What bands have three-word names that begin with "the" and end with a plural (e.g. The Beach
Boys, The Doobie Brothers)?

[{
"type" : "/music/artist",
"name" : null,
"name~=" : "^The * *s$"
}]

In addition to matching text with ~=, string constraints can also be applied with the <, >, <=, and
>= operators, which compare strings in case-insensitive, Unicode-aware alphabetical order. For
example, to find bands whose name begins with one of the letters A through F, use this query:

[{
"type" : "/music/artist",
"name" : null,
"name>=" : "A",
"name<" : "G"
}]

Note that it is not legal to constrain the id property, with either the pattern-matching operator or
the greater-than or less-than operators.

3.2.13. Limiting Queries


Every Metaweb query for a set of values is implicitly limited to 100 values -- to reduce resource
consumption and bandwidth usage, Metaweb does not return more values than this unless you
explicitly ask for more. If you ran the query above for bands whose name begins with the letters
A through F, you ran up against this limit. To change the number of desired results to a larger,
or a smaller, number, use the limit directive. Here, for example, is a query that returns the names
of up to 2000 bands:

[{
"type":"/music/artist",
"name":null,
"limit":2000
}]

limit is not a property name: it is a reserved word in MQL. No type may have a property named
"limit". Limits can be useful to prune the result tree of values you aren't really interested in. The
following query, for example, asks: "What bands have names that begin with 'The' and have
recorded songs longer than 8 minutes? I'm only interested in the band name, so just give me one
of the long songs, not the full list."

[{
"type" : "/music/artist",
"name" : null,
"name~=" : "^The",
"track": [{
"name":null,
"length":null,
"length>":480,
"limit":1
}]
}]

Note that we use a limit of one in the above. Specifying a limit of zero means "don't limit the
results: return everything you've got". Although MQL allows you to ask for an unlimited number
of results, Metaweb does not guarantee that you'll always get an answer. Complicated queries
with a large number of results may time out before Metaweb can complete the result.

Since the limit directive must appear within curly braces, limiting a query sometimes requires
you to transform a simple query into a slightly more complex one. Consider this query to list all
albums by The Police:

{
"type" : "/music/artist",
"name" : "The Police",
"album" : []
}

If we want to limit the result to five albums, we must rewrite the query as follows:

{
"type" : "/music/artist",
"name" : "The Police",
"album" : [{"name":null, "limit":5}]
}
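
The rewrite from "album":[] to a sub-query that can carry a limit is mechanical, so it can be sketched as a helper. The function name limit_values is hypothetical, and the sketch assumes that asking for the name of each value is an acceptable substitute for the bare [] form.

```python
import json


def limit_values(query, prop, limit, value_prop="name"):
    """Return a copy of query in which prop is requested as a list of
    objects, so that a limit directive can appear inside curly braces."""
    limited = dict(query)
    limited[prop] = [{value_prop: None, "limit": limit}]
    return limited


albums = {"type": "/music/artist", "name": "The Police", "album": []}
first_five = limit_values(albums, "album", 5)
print(json.dumps(first_five))
```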

3.2.14. The Sort Directive


Use the sort directive if you'd like the Metaweb server to sort the results of your query before
returning them. For example, to ask for the names of the tracks on an album in alphabetical order,
sort them by name:

// Tracks on the album Synchronicity, in alphabetical order
{
"type":"/music/album",
"name":"Synchronicity",
"artist":"The Police",
"track": [{
"name":null,
"sort":"name"
}]
}

As you can see, the sort directive simply specifies the name of the property by which the sort
is to be done. To order these same tracks from shortest to longest, use "length" as the sort key:

// Tracks on the album Synchronicity, from shortest to longest
{
"type":"/music/album",
"name":"Synchronicity",
"artist":"The Police",
"track": [{
"name":null,
"length":null,
"sort":"length"
}]
}

Note that the query above includes "length":null. If you want to use a property as a sort key,
you must query that property.

To reverse this order, precede the name of the sort key by a minus sign:

// Tracks on the album Synchronicity, from longest to shortest
{
"type":"/music/album",
"name":"Synchronicity",
"artist":"The Police",
"track": [{
"name":null,
"length":null,
"sort":"-length"
}]
}

The sorts shown above are convenient, but could easily be duplicated on the client side. That is,
you could request unordered results from Metaweb and sort them yourself. One situation in which
the sort directive cannot be duplicated on the client is when it interacts with the limit directive.
Result sets are truncated to the specified limit after the sort is applied. Use sort and limit together
in queries like this:

// What is the longest track on Synchronicity?
{
"type":"/music/album",
"name":"Synchronicity",
"artist":"The Police",
"track": {
"name":null,
"length":null,
"sort":"-length",
"limit":1
}
}

(Note that explicitly specifying a limit of 1 means that we can safely omit the square brackets
from the query.)
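
The order of operations (sort first, then truncate) is what makes this combination useful. The sketch below imitates it locally in Python over the two track lengths shown in the earlier sample result; sort_then_limit is a hypothetical helper, not a description of how Metaweb is implemented. It also shows why a client that receives an already truncated, unsorted list cannot recover the right answer.

```python
# Track lengths (in seconds) taken from the sample result earlier in this section.
tracks = [
    {"name": "Synchronicity II", "length": 305.066},
    {"name": "Wrapped Around Your Finger", "length": 313.733},
]


def sort_then_limit(values, sort_key, limit):
    """Order values by sort_key (a leading "-" reverses the order), then truncate."""
    descending = sort_key.startswith("-")
    prop = sort_key[1:] if descending else sort_key
    ordered = sorted(values, key=lambda v: v[prop], reverse=descending)
    return ordered[:limit]


# Sorting before truncating finds the longest of these tracks.
longest = sort_then_limit(tracks, "-length", 1)

# Truncating the unsorted list first simply keeps whatever came first.
truncated_first = sort_then_limit(tracks[:1], "-length", 1)

print(longest[0]["name"])
print(truncated_first[0]["name"])
```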

Sorting need not be limited to a single sort key. To specify more than one key, use an array on
the right-hand side of the sort directive:

// List all tracks by The Police, sorted by album name and track name
[{
"type":"/music/track",
"artist":"The Police",
"name":null,
"album":null,
"sort":["album","name"]
}]

If your query includes sub-queries, then the properties of those sub-queries can also be used as
sort keys. The query below is a variation on the one above that uses this kind of hierarchically-
named sort key:

// List all tracks by The Police, on albums released before 1980,
// sorted by album name and track name
[{
"type":"/music/track",
"artist":"The Police",
"name":null,
"album":{
"name":null,
"release_date":null,
"release_date<":"1980"
},
"sort":["album.name","name"]
}]

Here is an example that uses the sort directive in two places:

// List all albums by The Police, along with the name of their longest track.
// Order the albums from longest longest track to shortest longest track.
[{
"type":"/music/album",
"artist":"The Police",
"name":null,
"track":{
"name":null,
"length":null,
"sort":"-length",
"limit":1
},
"sort":"-track.length"
}]

3.2.15. Ordered Collections


If you do not include a sort directive in a query then Metaweb returns unordered results to you.
In practice, with the current implementation, values are returned in the order that they were added
to the database. For ordered data, such as the list of tracks on an album, this insertion order is
often non-random. But there is no guarantee that this will always be the case. If you don't ask for
the data to be sorted, you should treat the result as an unordered set of values rather than an
ordered list. [7]

Some data, such as the tracks on an album, have a natural order. If you want results to be sorted
according to this natural ordering, use "sort":"index". (Or, to reverse the natural ordering, use
"sort":"-index".)

[7] Metaweb's ordered collections are sometimes described as lists, but this term is inaccurate because lists are allowed to have duplicate
elements. Metaweb's ordered collections are still fundamentally sets, and duplicates are not allowed.

// Return the tracks on the album Synchronicity in the order that they appear
{
"type":"/music/album",
"artist":"The Police",
"name":"Synchronicity",
"track":[{
"name":null,
"index":null,
"sort":"index"
}]
}

Since we've used "index" as a sort key, we must query the value of "index" as well. index is a
keyword in MQL and is not a true property of any object. It can be queried, however, and when
you do this, Metaweb returns a non-negative integer. It is important to understand that the notion
of order does not apply to objects in Metaweb, but to the relationships between objects. It is the
link between the album "Synchronicity" and the track "Mother" that has an index of 3, not the
track itself. This becomes clear when you consider the case of a track that appears on more than
one album. If "Mother" also appears on an album named "Greatest Hits" it is likely to have a
different index on that album. [8]

Since index is not a true property, there are a lot of things you cannot do with it. You cannot
constrain the index with property names index> or index<. MQL read queries may use index as
a sort key, and they may query the index with "index":null, but may not use the keyword in
any other way. You cannot write "index":1 to ask for the second item in a set, for example. (The
index keyword can be used in other ways in write queries, however, and we'll learn about that
in Chapter 5).

The index keyword can be used in conjunction with the limit directive. Consider the following
query, which asks for the last two tracks on Synchronicity:

Query:

{
"type":"/music/album",
"artist":"The Police",
"name":"Synchronicity",
"track":[{
"name":null,
"index":null,
"sort":"-index",
"limit":2
}]
}

Result:

{
"type": "/music/album",
"artist": "The Police",
"name": "Synchronicity",
"track": [{
"index": 1,
"name": "Murder by Numbers"
},{
"index": 0,
"name": "Tea in the Sahara"
}]
}

[8] In Metaweb's schema tracks only appear on a single album: if multiple albums have a track by the same name, each one is a unique
object. So the example given here could not actually happen.

3.2.15.1. Indexes are Relative
The query above correctly returns the names of the final two tracks on the album Synchronicity.
Look carefully, however, at the index values it returns: the last track is given an index of 1 and
the penultimate track an index of 0. This is not a bug: this query simply reveals the true nature
of ordered collections in Metaweb. Metaweb does not include an absolute index for each link.
The implementation is able to say whether any link is greater-than or less-than another, but it
cannot tell you the absolute position of that link within the complete set of links.

The number that Metaweb returns as the value of the index property is a synthetic one, generated
by Metaweb as a simple way to express the order of elements. If Metaweb returns an array
holding n elements, then it generates index values for those elements that range from 0 to n-1.
For example, if you ask for the last two tracks on an album, the resulting values have indexes 0
and 1. If you ask for tracks that are shorter than 2 minutes and Metaweb finds three of them, then
it will assign them index values of 0, 1, and 2. If you want to know the track number for the tracks
on a particular album, you must query the complete set of tracks. Then add one to the index value
to get the track number. If you want to know the track numbers of the short songs, you must
query the complete set of tracks, and search for the short songs yourself.
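
The post-processing described above might look like the following Python sketch. The track list is hypothetical (invented names and lengths) and stands in for the result of a "sort":"index" query over the complete album; track_numbers is likewise a made-up helper name.

```python
# Hypothetical result of querying ALL tracks on an album with "sort":"index".
# Names and lengths are invented for illustration.
tracks = [
    {"index": 0, "name": "Track A", "length": 250.0},
    {"index": 1, "name": "Track B", "length": 95.5},
    {"index": 2, "name": "Track C", "length": 310.2},
    {"index": 3, "name": "Track D", "length": 110.0},
]


def track_numbers(all_tracks, predicate=lambda t: True):
    """Map track name to 1-based track number for tracks that pass predicate.

    The index values are only meaningful as track positions because
    all_tracks is the complete set, not a filtered or limited subset.
    """
    return {t["name"]: t["index"] + 1 for t in all_tracks if predicate(t)}


# Track numbers of the songs shorter than two minutes, found on the client.
short = track_numbers(tracks, lambda t: t["length"] < 120)
print(short)
```

Had the length constraint been placed in the query itself, the two short tracks would have come back with indexes 0 and 1, and their real positions on the album would be lost.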

3.2.16. Optional Queries


In addition to the limit and sort directives, MQL also includes an optional directive. If part
of your query is not required to match, add "optional":true to it. For example, we can use the
optional directive to ask the question: "What bands have names that begin with "The", and do
they have a Greatest Hits album?". The query looks like this:

[{
"type" : "/music/artist",
"name" : null,
"name~=" : "^The",
"album":[{
"name":null,
"name~=": "greatest hits",
"optional":true
}]
}]

Without the optional directive, the query would only return bands whose name begins with The
who have released a Greatest Hits album. With the optional directive, we get all bands whose
name begins with The, and additionally, we get the name of any albums they have released that
include the phrase "greatest hits".

Optional queries can be nested inside optional queries. The following query is an extension to
the one above. It further asks for the names of tracks longer than 5 minutes, if any exist, on the
Greatest Hits album, if it exists.

[{
"type" : "/music/artist",
"name" : null,
"name~=" : "^The",
"album":[{
"name":null,
"name~=": "greatest hits",
"optional":true,
"track": [{
"length":null,
"length>":300,
"name":null,
"index":null,
"sort":"index",
"optional":true
}]
}]
}]

Note that it is legal, but never necessary or useful, to add "optional":false to a query. Also, it
is never useful to use the optional directive in the top-level of a query. Queries are implicitly
optional at that level: if Metaweb can't find a match, it returns an empty result.

3.2.17. Using Fully-Qualified Property Names


Recall from the beginning of this tutorial that most objects in Metaweb have two or more types:

Query:

{
"id":"#1f800000000006df1b",
"name":null,
"type":[]
}

Result:

{
"id":"#1f800000000006df1b",
"name":"The Police",
"type":[
"/common/topic",
"/music/artist"
]
}

What do you do if you want to query one property, such as a list of albums, from one type, and
another property, such as a list of images, from a second type? MQL addresses this issue by
allowing you to specify a fully-qualified property name that includes the name of the type to which
it belongs. So here is how we ask for the albums by, and pictures of, The Police:

{
"type":"/music/artist",
"name":"The Police",
"album":[],
"/common/topic/image":[{}]
}

The first line of this query specifies that the object to be matched should be of type /music/artist.
The second line specifies the name of the object. name and type are properties of /type/object,
and are shared by all objects in the database. These property names (along with id, guid, key,
timestamp, creator, and permission) can always be used without qualification (although you
can qualify them with /type/object if you want to). Other types are not allowed to define
properties whose names conflict with these.

The third line of the query asks for a property named album. This property is not defined by
/type/object, but it is defined by /music/artist, and the query has already declared that the
object will be an instance of that type. The fourth line asks for a property named image. This is
not defined by /type/object nor by /music/artist, and so we must qualify it with the name
of its type so that Metaweb can understand it.

For symmetry, and to be explicit, you can rewrite the query to fully-qualify both properties of
interest:

{
"type":"/music/artist",
"name":"The Police",
"/music/artist/album":[],
"/common/topic/image":[{}]
}

If you do this, you might be tempted to drop the initial type specification, since the album property
is now fully-qualified:

[{
"name":"The Police",
"/music/artist/album":[],
"/common/topic/image":[]
}]

Notice that we've put the top-level query in square brackets now. This query will return any object
whose name is The Police, even if it has no album or image properties, and even if it is an instance
of neither /music/artist nor /common/topic.

Note that qualified property names use / as a delimiter and nested sort keys use . as a delimiter.
If your query uses qualified property names and sorts by those names, you may end up using
both delimiter characters. The following query is a variation on one shown earlier, in which two
of the properties have been (unnecessarily) qualified. Note the lengthy sort key:

// Police songs from albums released before 1990, sorted by album name
[{
"type":"/music/track",
"artist":"The Police",
"name":null,
"/music/track/album":{
"/type/object/name":null,
"release_date":null,
"release_date<":"1990"
},
"sort":"/music/track/album./type/object/name"
}]
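
Because the two delimiters are easy to confuse, it can help to assemble sort keys from their parts. The Python sketch below uses a hypothetical helper named sort_key: each argument is one property name (qualified or not), and the parts are joined with "." to step into a sub-query.

```python
def sort_key(*path, descending=False):
    """Join property names with "." to address a property of a sub-query.

    Each element of path may itself be a fully-qualified name containing "/".
    A leading "-" requests the reverse order.
    """
    key = ".".join(path)
    return "-" + key if descending else key


# The lengthy key from the query above.
print(sort_key("/music/track/album", "/type/object/name"))

# The unqualified, descending key used earlier for albums ordered by longest track.
print(sort_key("track", "length", descending=True))
```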

3.2.18. Wildcards
We saw wildcards earlier in this tutorial in the forms {"*":null} and {"*":{}}. But they are
somewhat more versatile than that. Consider the following query:

Query:

{
"id":"#1f8000000002f9e349",
"*":null
}

Result:

{
"id":"#1f8000000002f9e349",
"guid":"#1f8000000002f9e349",
"name": "Synchronicity",
"type": ["/music/album", "/common/topic"],
"key": [
"1299f319-8ff4-44fb-8440-7fb990972864",
"RELEASE3178"
],
"creator": "/user/mwcl_musicbrainz",
"permission": "/boot/all_permission",
"timestamp": "2006-11-30T13:42:18.0194Z"
}

This query identifies a unique object by ID, and then uses a wildcard to ask for all of its properties.
Since no type has been specified, the wildcard is expanded with all the properties of /type/object,
and the result is as shown above.

Note that some of the properties expand to a single value, and others to arrays. Thus the syntax
"*":null really means "*":null-or-[]. We could instead write the query using "*":[]. In this
case, all of the properties are returned as arrays, even unique properties.

Now let's modify the query to specify a type other than the default of /type/object:

{
"type":"/music/album",
"name":"Synchronicity",
"*":null
}

In this query, the * wildcard expands differently. Since we have specified that the object is of
type /music/album, Metaweb looks up the properties of that type and queries each one with a
null or [], depending on whether the property is unique or not. It does this in addition to also
querying the common object properties shown in the query result above.

Note that if a property is explicitly listed in a query, a wildcard expansion will not overwrite it.
Consider this:

{
"type":"/music/album",
"name":"Synchronicity",
"track":[{}],
"*":null
}

This query explicitly asks for an array of tracks, as objects rather than just as track names. The
expansion of the wildcard would normally include "track":[], but in this case that property
would conflict with the explicitly specified one and will be left out of the expansion.
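
The expansion rule for "*":null can be imitated locally to make it concrete. The Python sketch below is an illustration of the behavior described in this section, not Metaweb's implementation. The function name expand_wildcard and the small property table are assumptions: only a few properties are listed, and the uniqueness flag given for release_date is a guess made for the example.

```python
def expand_wildcard(query, properties):
    """Expand a "*":null entry in query.

    properties maps each property name to True if the property is unique.
    Unique properties are queried with None (null), others with [].
    Properties that the query already lists explicitly are left untouched.
    """
    expanded = {k: v for k, v in query.items() if k != "*"}
    if "*" in query:
        for prop, unique in properties.items():
            if prop not in expanded:
                expanded[prop] = None if unique else []
    return expanded


# A partial, assumed property table for an album object.
album_properties = {
    "name": True,
    "type": False,
    "key": False,
    "track": False,
    "release_date": True,  # assumed unique for this example
}

query = {
    "type": "/music/album",
    "name": "Synchronicity",
    "track": [{}],
    "*": None,
}
expanded = expand_wildcard(query, album_properties)
print(expanded)
```

The explicitly requested "track":[{}] survives the expansion, while key and release_date are filled in with [] and null respectively.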

Wildcards can also be used in a second, more aggressive, form. "*":{} expands to query each
property with {} or [{}] instead of null or []. Similarly, "*":[{}] expands to query each
property, even unique properties, with [{}]. Let's repeat the query with which we began this
section, using "*":{} instead. With this query, each of the properties of /type/object is expanded
into a complete object, and the result is much longer. The long response is reproduced here in its
entirety because it serves as a useful review of the structure of some of the most fundamental
Metaweb data types:

Query:

{
"id":"#1f8000000002f9e349",
"*":{}
}

Result:

{
"id": "#1f8000000002f9e349",
"guid": {
"type": "/type/id",
"value": "#1f8000000002f9e349"
},
"name": {
"lang": "/lang/en",
"type": "/type/text",
"value": "Synchronicity"
},
"type": [{
"type": ["/type/type"],
"id": "/music/album",
"name": "Record album"
},{
"type": ["/type/type"],
"id": "/common/topic",
"name": "Topic"
}],
"key": [{
"type": "/type/key",
"namespace": "/user/metaweb/datasource/MusicBrainz",
"value": "1299f319-8ff4-44fb-8440-7fb990972864"
}, {
"type": "/type/key",
"namespace": "/user/metaweb/datasource/MusicBrainz/name",
"value": "RELEASE3178"
}],
"creator": {
"type": ["/type/user"],
"id": "/user/mwcl_musicbrainz",
"name": null
},
"permission": {
"type": ["/type/permission"],
"id": "/boot/all_permission",
"name": "Global Write Permission"
},
"timestamp": {
"type": "/type/datetime",
"value": "2006-11-30T13:42:18.0194Z"
}
}

3.2.19. Expressing AND in Queries


MQL queries use JSON properties to express constraints. Those constraints are implicitly ANDed
together. Consider:

[{
"type":"/music/artist",
"name":null,
"name~=":"^The",
"album":"Greatest Hits"
}]

This query says: tell me the names of objects which have type "/music/artist" AND which have
a name that begins with "The" AND which have an album named "Greatest Hits".

Suppose we want to find the names of all bands who have an album named "Greatest Hits" AND
an album named "Super Hits". We might try this query:

[{
"type":"/music/artist",
"name":null,
"album":["Greatest Hits","Super Hits"] // Invalid MQL
}]

But this is not legal MQL. And if it were, it would probably mean: find an artist who has recorded
exactly two albums, with names "Greatest Hits" and "Super Hits". A musical artist object may
have multiple album links to album objects. We want to constrain our query so that all result objects
have links to two specific album names. Here's a natural way to express this query:

[{
"type":"/music/artist",
"name":null,
"album":"Greatest Hits",
"album":"Super Hits" // Invalid JSON
}]

This query makes sense in the Metaweb object model: find objects that have one "album" link
to an album named "Greatest Hits" and another "album" link to an album named "Super Hits".
Unfortunately, this query is not valid JSON: since it includes the same property name twice, it
cannot be parsed into object form.

MQL's solution to this dilemma is to allow an arbitrary identifier and colon to prefix any property
name. The prefix and colon are ignored: they serve simply as a workaround to the JSON limitation
just described. With this trick we can rewrite the query above like this:

Query:

[{
"type":"/music/artist",
"name":null,
"a:album":"Greatest Hits",
"b:album":"Super Hits",
"limit":2
}]

Result:

[{
"type": "/music/artist",
"name": "Alice Cooper",
"a:album": "Greatest Hits",
"b:album": "Super Hits"
},{
"type": "/music/artist",
"name": "Dan Fogelberg",
"a:album": "Greatest Hits",
"b:album": "Super Hits"
}]

Note that the arbitrary prefixes we choose for the query are repeated in the result objects. The
prefixes are arbitrary, but they must be valid identifiers, which means they cannot contain
punctuation characters and must not begin with a digit.

This property prefixing scheme is not limited to sets of two properties. And prefixed properties
can include operator suffixes. Let's find bands that have lots of hits and have recorded Christmas
albums:

[{
"type":"/music/artist",
"name":null,
"a:album":"Greatest Hits",
"b:album":"Super Hits",
"c:album~=":"christmas",
"c:album":[]
}]
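
When the list of required values is only known at run time, the prefixes can be generated. The Python sketch below uses a hypothetical helper named all_of that assigns single-letter prefixes (a, b, c, ...), which satisfy the identifier rules above; any other valid identifiers would work equally well.

```python
import json
import string


def all_of(prop, values):
    """Return prefixed constraints that require a separate prop link per value."""
    if len(values) > len(string.ascii_lowercase):
        raise ValueError("too many values for single-letter prefixes")
    return {
        f"{prefix}:{prop}": value
        for prefix, value in zip(string.ascii_lowercase, values)
    }


constraints = all_of("album", ["Greatest Hits", "Super Hits"])

query = [{"type": "/music/artist", "name": None}]
query[0].update(constraints)
print(json.dumps(query))
```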

Another use of property prefixes is to constrain a property and also query the property at the same
time. Let's find bands that have released a Greatest Hits album, and also ask for the names of all
the albums they have released:

[{
"type":"/music/artist",
"name":null,
"album":[],
"includes:album":"Greatest Hits"
}]

Note that although property prefixes are arbitrary, we can choose identifiers that add meaning to
our queries.

At the beginning of this tutorial, we wrote a query to determine the types of the object that
represents The Police. In order to do this, we first asked for the id of The Police, and then used the
resulting guid to uniquely identify the object so we could ask for its types. Property prefixes make
this easier:

Query:

{
"constraint:type":"/music/artist",
"name":"The Police",
"query:type":[]
}

Result:

{
"constraint:type": "/music/artist",
"name": "The Police",
"query:type": [
"/music/artist",
"/common/topic"
]
}

As an interesting aside, let's return to the query with which we started this section. We want to
find bands that have released "Greatest Hits" and "Super Hits" albums. There is actually a way
to do this without property prefixes. It relies on the fact that Metaweb relationships are always
bi-directional and that MQL queries can be "turned inside out":

[{
"type":"/music/artist",
"name":null,
"album":[{
"name":"Greatest Hits",
"artist":{
"album":"Super Hits"
}
}]
}]

Translated into English, this query says: "give me the names of all bands that have released an
album named 'Greatest Hits', the artist of which has released an album named 'Super Hits'." The
album property of a band object refers to an album object. And the artist property of the album
object refers back to the band object. We can use this fact to further constrain the artist. This
technique (some would say "hack") is worth understanding because it illustrates one of the deep
properties of Metaweb objects.

3.2.20. Expressing OR in Queries


MQL has no general-purpose way to express an OR relationship. Suppose we want a list of bands
who have released an album named "Greatest Hits" OR an album named "Super Hits". The ~=
pattern matching operator does not have an OR syntax, and there is no way to specify that an album
name match either "Greatest Hits" OR "Super Hits". The only way to do this is to make two
queries and combine their results: [9]

[{
"type":"/music/artist",
"name":null,
"album":"Greatest Hits"
}]

[{
"type":"/music/artist",
"name":null,
"album":"Super Hits"
}]

[9] It is possible to send two distinct queries in a single envelope with a single HTTP request. We'll learn how to do this in Chapter 4.

Combining the results of two queries is fairly straightforward. The only tricky issue is avoiding
duplicates. If a band appears in the results of both queries, for example, you would want to take
care that it did not appear twice in the combined result.
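
A duplicate-free merge might be sketched as follows in Python. The helper name union is hypothetical. The two result lists are illustrative: Alice Cooper and Dan Fogelberg are taken from the earlier sample result, and the "Example Band" entries are invented. The sketch compares bands by name because that is all these queries ask for; if the queries also requested "id":null, comparing by id would be the safer choice, since names are not guaranteed to be unique.

```python
def union(first, second, key="name"):
    """Combine two result lists, keeping the first occurrence of each key value."""
    seen = set()
    combined = []
    for item in first + second:
        if item[key] not in seen:
            seen.add(item[key])
            combined.append(item)
    return combined


# Illustrative results for the "Greatest Hits" and "Super Hits" queries.
greatest = [
    {"name": "Alice Cooper"},
    {"name": "Dan Fogelberg"},
    {"name": "Example Band A"},  # invented
]
superhits = [
    {"name": "Dan Fogelberg"},
    {"name": "Example Band B"},  # invented
    {"name": "Alice Cooper"},
]

either = union(greatest, superhits)
print([band["name"] for band in either])
```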

Combining property prefixes with a pattern matching operator and the optional directive, we
can achieve something vaguely like an OR operation:

[{
"type":"/music/artist",
"name":null,
"album":[],
"album~=":"hits",
"great:album":[{
"name":"Greatest Hits",
"optional":true
}],
"superb:album":[{
"name":"Super Hits",
"optional":true
}]
}]

This query returns all bands that have any albums whose name includes the word "hits". It returns
the names of those albums, and includes optional sub-queries for the particular names we're
interested in.

3.2.20.1. The |= Constraint


We began the discussion of expressing OR in queries by saying that MQL had no general-purpose
syntax for expressing OR. There is, however, a specialized syntax for expressing OR, and it is
useful in a number of situations. |= can be used at the end of a property as a constraint like > or
~=. The value of such a constrained property should be a JSON array of strings. The constraint
says "match any one of the values in this array". (That is: match the first value OR the second
value OR the third value...)

The reason that this is not a general-purpose way to express OR in MQL is that it only works
when the strings in the array are object ids or guids. The meaning and use of the |= constraint
becomes much clearer with some examples. One straightforward use is to run the same query
over multiple objects that are specified by id. The following query asks for the properties of three
types:

[{
"id|=":["/type/type", "/type/property", "/type/key"],
"id":null,
"/type/type/properties":[]
}]

This next example asks for the ids of GIF or PNG (but not JPEG) images:

[{
"type":"/common/image",
"id":null,
"/type/content/media_type":null,
"/type/content/media_type|=":[
"/media_type/image/gif",
"/media_type/image/png"
]
}]

Finally, here is an example that uses both the |= constraint to express an OR and uses a property
prefix to express AND. It asks for the French and Spanish translations of the country name
"England":

Query:

{
"type":"/location/country",
"english:name": "England",
"foreign:name": [{
"value":null,
"lang":null,
"lang|=":["/lang/fr","/lang/es"]
}]
}

Result:

{
"type":"/location/country",
"english:name":"England",
"foreign:name":[{
"value":"Angleterre",
"lang":"/lang/fr"
},{
"value":"Inglaterra",
"lang":"/lang/es"
}]
}

3.2.21. Expressing NOT in Queries


Metaweb has no syntax to perform logical NOT operations. In general, with a huge universe of
knowledge, the NOT of a result may be a very, very large set of objects. There is not a way to
write a single query that says: list all bands who have a Greatest Hits album but do not have an
album that includes the word "Best". To do this, you'd first query the bands with a Greatest Hits
album. Then you'd query the bands who have "album~=":"best", and then you'd subtract the
results of the second query from the first query.

As another example, suppose you wanted to know what bands had an album named "Greatest
Hits", but wanted to exclude all country music. You could do one query for Greatest Hits albums,
and then do another for all country music bands (using the /music/artist/genre property), and
then subtract the second result from the first. This is not particularly efficient, since there are
probably a whole lot of country music bands. Better would be a single query for albums named
"Greatest Hits" that also asks for the genre of the album (with /music/album/genre). Then, parse
the JSON result, and post-process it yourself to remove albums whose genre is "country".
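
Both client-side approaches described in this section can be sketched in a few lines of Python. The helper names subtract and without_genre are hypothetical, all band names and genres below are invented, and the sketch assumes that the genre of an album comes back as a list of genre names.

```python
def subtract(first, second, key="name"):
    """Return the items of first whose key value does not appear in second."""
    excluded = {item[key] for item in second}
    return [item for item in first if item[key] not in excluded]


def without_genre(albums, genre):
    """Drop albums whose (assumed list-valued) genre includes the given genre."""
    return [album for album in albums if genre not in album["genre"]]


# Invented results: bands with a Greatest Hits album, and bands matching "album~=":"best".
greatest_hits_bands = [{"name": "Band A"}, {"name": "Band B"}, {"name": "Band C"}]
best_of_bands = [{"name": "Band B"}]

# Invented result of a single query for "Greatest Hits" albums and their genres.
albums = [
    {"name": "Greatest Hits", "artist": "Band A", "genre": ["rock"]},
    {"name": "Greatest Hits", "artist": "Band C", "genre": ["country"]},
]

remaining_bands = subtract(greatest_hits_bands, best_of_bands)
non_country = without_genre(albums, "country")
print(remaining_bands)
print(non_country)
```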

3.2.22. Reflective Queries


If you've enjoyed making queries against Freebase's repository of musical knowledge, you might
also enjoy querying the underpinnings of the Metaweb infrastructure. Types, properties and
namespaces are all Metaweb objects, and they can all be queried just like other objects. Here, for
example, is how we find the properties of /type/object: 10

Query

{
  "type":"/type/type",
  "id":"/type/object",
  "properties":[]
}

Result

{
  "type": "/type/type",
  "id": "/type/object",
  "properties": [
    "/type/object/id",
    "/type/object/guid",
    "/type/object/type",
    "/type/object/name",
    "/type/object/key",
    "/type/object/timestamp",
    "/type/object/permission",
    "/type/object/creator"
  ]
}

And let's ask about the name property:

Query

{
  "type":"/type/property",
  "id":"/type/object/name",
  "*":null
}

Result

{
  "type": "/type/property",
  "id": "/type/object/name",
  "guid": "#1f80000000000000ca",
  "name": "name",
  "key": ["display_name", "name"],
  "expected_type": "/type/text",
  "unique": true,
  "schema": "/type/object",
  "master_property": null,
  "reverse_property": [],
  "creator": "/user/root",
  "permission": "/boot/root_permission",
  "timestamp": "2006-11-30T12:43:53.0081Z"
}

10. As a matter of convention, property names are usually singular, even when they are non-unique
properties and multiple values are expected. /type/type/properties is an exception to this rule.

This kind of reflective query is not only useful for exploring the Metaweb infrastructure, but can
be helpful in understanding the schemas of types you actually want to use. Suppose you know
that the type /music/album has a property named track, and you want to know what type a track
is so that you can query tracks directly. This is actually a very easy query, if you understand how
types and properties work:

Query

{
  "id":"/music/album/track",
  "/type/property/expected_type":null
}

Result

{
  "id":"/music/album/track",
  "/type/property/expected_type":"/music/track"
}

Note that we omitted a type specification from this query and instead simply used the fully-
qualified name of the expected_type property.

If you were planning to write a program that made many music-related queries, you might first
want to explore all of Freebase's music-related types. But where do you get a list? You query the
domain: 11

Query

{
  "type":"/type/domain",
  "id":"/music",
  "types":[]
}

Result

{
  "type": "/type/domain",
  "id": "/music",
  "types": [
    "/music/group_membership",
    "/music/recording_contribution",
    "/music/track",
    "/music/album",
    "/music/album_release_type",
    "/music/artist",
    "/music/genre",
    "/music/group_member",
    "/music/instrument",
    "/music/performance_role",
    "/music/record_label",
    "/music/voice"
  ]
}

11. /type/domain/types is another plural property name.
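
Once you have the list of type ids from the domain query, you can reuse the /type/type query shown earlier to ask each type for its properties. The Python sketch below only builds those queries; it does not send them. The helper name property_queries is our own.

```python
def property_queries(type_ids):
    """Build one reflective query per type id, asking for that type's properties."""
    return [{"type": "/type/type", "id": type_id, "properties": []}
            for type_id in type_ids]


# A few of the type ids from the /music domain result shown above.
music_types = ["/music/track", "/music/album", "/music/artist"]

for query in property_queries(music_types):
    print(query)
```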

3.3. The MQL Grammar
This chapter began with an explanation of the JSON grammar. Although MQL uses JSON, it has
additional grammar rules layered on top of those imposed by JSON. The extended tutorial above
explained MQL by example. This section documents MQL grammar more formally. In the
grammar that follows, bold text indicates terminals that must appear literally and italics are used
for the terms defined by the grammar. Angle brackets indicate terms that are not formally defined,
such as <positive number>. Spaces in the grammar are optional, and may usually be replaced
by zero or more whitespace characters. In places where no whitespace is allowed, a . is used to
indicate concatenation.

Let's begin at the top level. A query is a comma-separated list of one or more pairs, enclosed
in curly braces, which may optionally be enclosed in square brackets:

query:: { pairs } | [{ pairs }]


pairs:: pair ( , pair)*

A pair may be a property, a wildcard, a comparison, or a directive:

pair:: property | wildcard | comparison | directive

A property begins with a property name in quotation marks followed by a colon and a property
value. The property value may be a nested query, a JSON literal value or an "empty" value such
as null or []. As a special case, the index query "index": null is also allowed.

property:: " . property-name . " : property-value |
           "index" : null

property-value:: query | literal | empty

literal:: <JSON-string> | <JSON-number> | true | false

empty:: null | [ ] | { } | [ { } ]

A property-name is either a simple-name or a qualified-name, or a prefix and a colon followed
(without spaces) by a simple-name or qualified-name.

A simple-name is an identifier that is not a reserved word. A prefix is the same thing. A
qualified-name consists of a slash and one or more identifiers followed by slashes, all followed by a
simple-name. Finally, an identifier is a string of ASCII characters that begins with a letter and
consists of letters, numbers and underscores. Additionally, an identifier may not end with an
underscore or contain two underscores in a row. Reserved words include sort, limit, optional
and index, and also include directives used by the MQL write grammar (see Chapter 5) and a
number of other identifiers that are reserved for possible use in the future:

property-name::  simple-name |
                 qualified-name |
                 prefix . : . simple-name |
                 prefix . : . qualified-name
simple-name::    identifier <but not reserved-word>
qualified-name:: / . (identifier . /)+ . simple-name
prefix::         identifier <but not reserved-word>
identifier::     <ASCII string matching /^[A-Za-z](_?[A-Za-z0-9])*$/>
reserved-word::  all | any | as | attribute | class |
                 connect | count | create | cursor | datatype |
                 default | delete | destroy | else | for |
                 function | future | if | in | index |
                 insert | is | left | limit | link |
                 macro | meta | mql | offset | optional |
                 pagesize | property | read | relationship | replace |
                 return | right | scope | select | self |
                 sort | sql | super | this | typeguid |
                 update | var | while | write | xml
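
The naming rules above are mechanical enough to check in code. The following Python sketch transcribes the identifier pattern and the reserved-word list directly from the grammar. The function names are ours, and the sketch uses fullmatch() in place of the ^ and $ anchors. Treat it as an illustration of the rules, not as the validator that mqlread itself uses.

```python
import re

# identifier:: a letter, then letters, digits and single (non-trailing) underscores
IDENTIFIER = re.compile(r"[A-Za-z](_?[A-Za-z0-9])*")

RESERVED = set("""
all any as attribute class connect count create cursor datatype
default delete destroy else for function future if in index
insert is left limit link macro meta mql offset optional
pagesize property read relationship replace return right scope select self
sort sql super this typeguid update var while write xml
""".split())


def is_identifier(s):
    return IDENTIFIER.fullmatch(s) is not None


def is_simple_name(s):
    # simple-name and prefix: an identifier that is not a reserved word
    return is_identifier(s) and s not in RESERVED


def is_qualified_name(s):
    # qualified-name:: / . (identifier . /)+ . simple-name
    if not s.startswith("/"):
        return False
    parts = s[1:].split("/")
    if len(parts) < 2:
        return False
    return all(is_identifier(p) for p in parts[:-1]) and is_simple_name(parts[-1])


def is_property_name(s):
    # An optional prefix and colon, then a simple or qualified name
    prefix, colon, rest = s.partition(":")
    if colon:
        return is_simple_name(prefix) and (is_simple_name(rest) or is_qualified_name(rest))
    return is_simple_name(s) or is_qualified_name(s)


for name in ["release_date", "foreign:name", "/type/property/expected_type",
             "bad__name", "trailing_", "sort"]:
    print(name, is_property_name(name))
```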

The remaining kinds of pair that can appear in a query are wildcard, comparison and directive.
A wildcard is an asterisk in quotes (to indicate a wildcard property name) followed by a colon
and an empty query (a null, [], {} or [{}]):

wildcard:: "*" : empty

A comparison is a quoted name followed by a colon and a value, where the name includes an
operator and the value is a string or a number:

comparison:: " . comparison-name . " : comparison-value


comparison-name:: property-name . comparison-operator
comparison-value:: <JSON-string> | <JSON-number>
comparison-operator:: < | <= | > | >= | ~=
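
A program that inspects queries may need to separate a comparison-name into its property name and operator. The Python sketch below does this for the five operators listed in the grammar above. The function name split_comparison is ours, and the sketch does not check that the remaining text is a valid property-name.

```python
# Two-character operators come first so that "<=" is not mistaken for "<".
OPERATORS = ("<=", ">=", "~=", "<", ">")


def split_comparison(name):
    """Split 'release_date<=' into ('release_date', '<='); return None if no operator."""
    for op in OPERATORS:
        if name.endswith(op):
            return name[:-len(op)], op
    return None


for key in ["release_date<=", "album~=", "name<", "name"]:
    print(key, split_comparison(key))
```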

A directive is a limit, optional or sort directive:

directive:: "limit" : <positive number> |
            "optional" : true | "optional" : false |
            sort-directive

The sort directive syntax requires further explanation. A sort directive is the keyword "sort"
followed by a colon and a sort key or an array of sort keys. A sort key is the keyword "index"
or one or more property names separated with . characters:

sort-directive:: "sort" : sort-keys


sort-keys:: sort-key | [ ( sort-key , )* sort-key ]
sort-key:: "index" |
" . property-name ( . property-name )* "

The MQL grammar for updating Metaweb has some additional rules. We'll learn about those in
Chapter 5.

Chapter 4. Metaweb Read Services
Chapter 3 explained how to express Metaweb queries using MQL. This chapter explains how to
deliver those queries to Metaweb servers and retrieve their response using the mqlread service.
It also explains how to retrieve chunks of data (such as images and HTML documents) using the
trans service. The chapter includes many example applications, written in Perl, Python, PHP,
and JavaScript.

4.1. Basic mqlread Queries with Perl


Metaweb's services are all implemented on top of the HTTP protocol. Submitting a MQL query
and retrieving the response, therefore, is simply a matter of constructing the appropriate URL
and fetching its content via an HTTP request.

The basic URL for submitting MQL queries to freebase.com is:

http://www.freebase.com/api/service/mqlread

To submit a query to the mqlread service, place the query inside an "envelope" object. Next, use
JSON to serialize the envelope object. Then URL-encode the JSON string and prefix it with
?queries=. Finally, append the whole thing to the URL above, and retrieve the content of the
resulting URL with an HTTP GET request.
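
These steps can be sketched in a few lines of Python 3 using only the standard library. The function name mqlread_url, the envelope name "qname" and the band name are placeholders. The sketch builds the URL but does not fetch it, and it decodes the URL again only to show that the encoding round-trips.

```python
import json
from urllib.parse import quote, unquote

MQLREAD_URL = "http://www.freebase.com/api/service/mqlread"


def mqlread_url(query, name="qname"):
    """Wrap `query` in an envelope, serialize it, URL-encode it, and build the URL."""
    envelope = {name: {"query": query}}       # outer object holding one inner envelope
    serialized = json.dumps(envelope)         # JSON-serialize the envelope
    return MQLREAD_URL + "?queries=" + quote(serialized, safe="")


url = mqlread_url({"type": "/music/artist", "name": "Some Band", "album": []})
print(url)

# Decoding the queries parameter recovers the original envelope.
decoded = json.loads(unquote(url.split("?queries=", 1)[1]))
print(decoded)
```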

The metaweb-user Cookie

While freebase.com is being rolled out and scaled up, the mqlread service at that site requires
HTTP cookie headers for authentication. If you are an early user at freebase.com, this issue
will affect you. Some of the examples in this chapter demonstrate how to correctly log in
to freebase.com, obtain credentials, and submit them to the mqlread service using HTTP
cookies. Others, like the one that follows, simply ask you to hardcode your cookie data
into the script.

To find the authentication data you need, first visit the freebase.com home page and make
sure you are logged in. This will ensure that your browser has stored the necessary cookies.
Obtaining cookie data is a browser-specific task. If you are using Firefox, pull down the
Edit menu and select Preferences. Click the Show Cookies... button in the dialog that
appears. This will open a second dialog. Enter "freebase.com" into the Search box, and
then scroll through the resulting list of cookies until you find the one named metaweb-user.
Highlight this entry in the list to view the content of the cookie. It will probably look something
like this:

A|u_docs|g_#1f8000000001209013|4.xwItasvXuoVOQiXg3Sm04b

This is the string you'll need to enter into Example 4.1 and Example 4.6. (You have to use
your own, though: the cookie shown here is not actually valid.)

Example 4.1 is a command-line utility that lists the albums released by any band you specify. It
uses the Metaweb API to retrieve data from freebase.com. It is written in Perl, and demonstrates
how to nest an MQL query within an envelope and send that envelope to the mqlread service.
(The structure of the envelope object will be explained in Section 4.2.3.) It sends hard-coded
authentication credentials to mqlread using HTTP cookies. Until freebase.com has fully opened
its services to the world, you'll have to insert your own cookie data into this script to make it
work.

Example 4.1. albumlist.pl: submitting MQL queries in Perl

#!/usr/bin/perl
use URI::Escape;   # This module provides the uri_escape function used below

# Build the Metaweb query, using string manipulation
# CAUTION: the use of string manipulation here makes this script vulnerable
# to MQL injection attacks when the command-line argument includes JSON.
$band = $ARGV[0];  # This is the band or musician whose albums are to be listed
$query='{"type":"/music/artist","name":"' . $band . '","album":[]}';

# Now place the query in JSON envelope objects, and URL encode the envelopes
$envelope = '{"qname":{"query":' . $query . '}}';
$escaped = uri_escape($envelope);

# Construct the URL that represents the query
$baseurl='http://www.freebase.com/api/service/mqlread';  # Base URL for queries
$url = $baseurl . "?queries=" . $escaped;

# During Freebase's roll-out, authentication data must be encoded in a cookie
# Enter your cookie data below.
$auth = 'metaweb-user=###Enter Your cookie data here###';

# Use the command-line utility curl to supply the cookies and fetch the
# content of the URL.
$result = `curl -s --cookie \'$auth\' $url`;

# Use regular expressions to extract the album list from the HTTP response
$result =~ s/^.*"album"\s*:\s*\[\s*([^\]]*)\].*$/$1/s;
$result =~ s/[ \t]*"[ \t,]*//g;

# Finally, display the list of albums
print "$result\n";

4.1.1. A Better Perl Album Lister


The first thing to notice about Example 4.1 is that it does not use a JSON serializer or parser: a
JSON-encoded MQL query is constructed with string concatenation and the desired results are
extracted with regular expressions. These shortcuts keep the example simple and allow us to focus
on how the mqlread URL is built and its content fetched. More sophisticated applications, however,
use a JSON encoder to serialize the query and a JSON decoder to parse the result. Building
queries with string manipulation can be reasonable in the simplest applications (though caution
is required to avoid MQL injection attacks), but attempting to extract results with regular
expressions is brittle and not a technique to emulate in your own code!
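
To see why string concatenation invites MQL injection, compare the two ways of building the query from Example 4.1 in the Python sketch below. The hostile input and the id it injects are invented for illustration, and the function names are ours.

```python
import json


def query_by_concatenation(band):
    # Mirrors the string manipulation used in Example 4.1
    return '{"type":"/music/artist","name":"' + band + '","album":[]}'


def query_by_serializer(band):
    # The serializer escapes any quotation marks in the band name
    return json.dumps({"type": "/music/artist", "name": band, "album": []})


# A made-up "band name" that closes the name string and adds a property.
hostile = 'x","id":"/some/other/object'

print(query_by_concatenation(hostile))   # the query now contains an "id" constraint
print(query_by_serializer(hostile))      # the whole input stays inside "name"
```

With concatenation, the caller's text becomes part of the query structure. With a serializer, it can only ever be the value of the name property.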

Example 4.2 is a higher-level version of Example 4.1. It uses a JSON serializer and parser and
a higher-level API for URL manipulation. To use it, you must have the JSON.pm module
(which you can find at http://search.cpan.org) installed. This version of the program uses a
somewhat more sophisticated query to sort albums by their release date, and it performs error
checking and error reporting in case anything goes wrong with the query. Finally, this version
demonstrates how to log in to Metaweb to obtain authentication credentials. Instead of
hardcoding your cookie data into the script, you hardcode your Freebase username
and password. Although this example uses the Metaweb login service, that service is not formally
documented until Chapter 6.

Note that Example 4.2 places the query in the inner and outer envelope objects before JSON
serialization. Note also that the mqlread service returns the query results in its own two-layer
response envelope object. The outer response object has a property with the same name as the
property used in the outer query envelope. The inner response object has a property named
result. The value of the result property is the result of the MQL query.

Example 4.2. albumlist2.pl: a better Perl album lister

#!/usr/bin/perl -w
use strict;            # Don't allow sloppy syntax
use JSON;              # JSON encoding and decoding
use URI::Escape;       # URI encoding
use LWP::UserAgent;    # High-level HTTP API

# Some constants for this script
my $SERVER = 'http://www.freebase.com';            # The Metaweb server
my $QUERYURL = $SERVER . '/api/service/mqlread';   # Path to mqlread service
my $LOGINURL = $SERVER . '/api/account/login';     # Path to login service
my $USERNAME = 'user';                             # Enter your Freebase username here
my $PASSWORD = 'pass';                             # Enter your Freebase password here

# Create the HTTP "user agent" we'll use to send the query
my $ua = LWP::UserAgent->new;

# Login to Metaweb to get authentication credentials for our UA object.
# This will add an authentication cookie to subsequent HTTP requests.
# The login() subroutine is defined below.
&login($ua, $USERNAME, $PASSWORD);

# What band did the user ask about?
my $band = $ARGV[0];

# Construct a Metaweb query as a Perl data structure
my $query = {
    type => "/music/artist",          # We're looking for a band
    name => $band,                    # This is the name of the band
    album => [{                       # Return some albums
        name => undef,                # undef is Perl's null
        sort => "release_date",       # sort by release date
        release_date => undef         # return release date, too
    }]
};

# Put the query in an envelope object
my $envelope = {           # This is the outer envelope object
    albumquery => {        # "albumquery" is an arbitrary name for inner envelope
        query => $query    # The "query" property of inner envelope holds query
    }                      # End of inner envelope
};                         # End of outer envelope

# Convert the envelope object from Perl hash to JSON string, and URI encode it
my $json = JSON->new();                      # Create JSON parser/serializer
my $encoded = $json->objToJson($envelope);   # Serialize object to string
my $escaped = uri_escape($encoded);          # URI encode the string

# Build the complete query url
my $url = $QUERYURL . "?queries=" . $escaped;

# Send request to the server and get the response
my $response = $ua->get($url);

if ($response->is_success) {                        # If we get HTTP 200 OK
    my $responsetext = $response->content;          # Get result as JSON text
    my $outer = $json->jsonToObj($responsetext);    # Parse text to a Perl hash
    my $inner = $outer->{albumquery};               # Open outer envelope

    if ($inner->{status} ne "/mql/status/ok") {     # If the query was not okay
        my $err = $inner->{messages}[0];            # Get the error message obj
        die $err->{status}.': '.$err->{message};    # and exit with error message
    }

    my $result = $inner->{result};                  # Open inner envelope
    my $albums = $result->{album};                  # Get albums from result
    for my $album (@$albums) {                      # Loop through albums
        print "$album->{name}";                     # Print the name of each
        if ($album->{release_date}) {               # Print release date
            print " [" . substr($album->{release_date},0,4) . "]";
        }
        print "\n";                                 # Add a newline
    }
}
else {                                              # If query failed
    die "Server returned error code " . $response->code . "\n";
}

# This subroutine calls the Metaweb login service to obtain authentication
# credentials. It asks the UA to send those credentials as cookies in
# all future requests.
sub login {
    my($ua, $username, $password);
    ($ua, $username, $password) = @_;      # Get subroutine arguments

    # Post username and password to the login service
    my $res = $ua->post($LOGINURL, {username=>$username,password=>$password});

    my $raw = $res->header('Set-Cookie');  # Get raw cookies from the response
    die "Login failed" if !$raw;           # If none, then login failed
    my @cookies = split(', ',$raw);        # Break cookies at commas

    # Each cookie is broken into fields with semicolons.
    # We want only the first field of each cookie
    my $credentials = '';                  # We'll accumulate login credentials here
    for my $cookie (@cookies) {            # Loop through cookies
        my @parts = split(";", $cookie);   # Split each one on ;
        $credentials = $credentials . $parts[0] . ';';   # Remember first part
    }
    chop($credentials);                    # Remove trailing semicolon

    # Tell the UA to send our credentials on every request
    $ua->default_header('Cookie' => $credentials);
}
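
The cookie handling in the login() subroutine of Example 4.2 can also be sketched in Python. Splitting a combined Set-Cookie header on commas, as the Perl code does, may mis-split a cookie whose Expires date contains a comma, so this sketch assumes instead that the HTTP library supplies one string per Set-Cookie header. The function name cookie_header and the cookie values are placeholders.

```python
def cookie_header(set_cookie_values):
    """Build a Cookie header value from a list of Set-Cookie header values.

    Only the first field (name=value) of each cookie is kept; attributes
    such as Path and Expires are dropped.
    """
    pairs = []
    for value in set_cookie_values:
        first_field = value.split(";", 1)[0].strip()
        if first_field:
            pairs.append(first_field)
    return "; ".join(pairs)


# Placeholder Set-Cookie values, not real credentials.
received = [
    "metaweb-user=PLACEHOLDER; Path=/",
    "other-cookie=PLACEHOLDER2; Expires=Wed, 01 Jan 2020 00:00:00 GMT; Path=/",
]
print(cookie_header(received))
```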

4.2. The mqlread Service


Now that we've seen some working code, this section explains more formally how mqlread works.
As you've seen in the preceding examples, the URL for the mqlread service on freebase.com is:

http://www.freebase.com/api/service/mqlread

The sub-sections that follow document mqlread input and output, and also specify the format of
the query and response envelopes.

4.2.1. mqlread Input


The mqlread service responds to HTTP GET requests. Parameters to the service are encoded into
the URL as name/value pairs following the ? character. The following parameters are supported:

queries     The value of this required parameter is a JSON-encoded and URI-encoded
            "envelope" object that holds the query or queries to be executed. The format
            of the envelope is described in Section 4.2.3.

callback    The optional callback parameter allows you to submit a request to mqlread via
            a <script> tag. It affects the behavior of mqlread in the following ways:

            • The response object is wrapped within a JavaScript function invocation, and
              the value of the callback parameter is used as the name of the function.

            • The Content-Type header of the response is set to text/plain instead of
              application/json.

4.2.2. mqlread Output
mqlread returns an HTTP response with a Content-Type header of application/json (or
text/plain if the callback parameter was specified). The body of the response is a
JSON-serialized envelope object that holds a MQL result object (or objects if multiple queries
were submitted). The format of mqlread response envelopes is specified in Section 4.2.3.

4.2.3. Query and Response Envelopes


MQL queries must be nested inside two JSON objects before being sent to mqlread. Similarly,
the MQL response sent by mqlread is nested inside two layers of JSON objects. These wrapper
objects are known as envelopes. To understand the envelope metaphor, imagine that the internet
is actually run by the postal service...

Suppose that we have two MQL queries that we want mqlread to execute. We write the first
query on a piece of paper, fold it up and place it in an envelope. (This is the "inner query envelope
object".) We name this query "q0" and write those letters on the envelope. Next we write the
second query on another piece of paper. We put that paper in another envelope, and write "q1"
on that envelope. Finally, we place both envelopes within a cardboard box (the outer query
envelope object) and mail the box off to http://www.freebase.com/api/service/mqlread.

The mqlread service opens the box and opens the envelopes it contains. It executes the two
queries and writes the results on two pieces of paper. It places the results of the first query in an
envelope (the inner response envelope) and writes "q0" on that envelope. It does the same for
the results of the second query, and writes "q1" on that envelope. Then it puts the two envelopes
in a box (the outer response envelope) and mails the box back to us.

The envelopes and cardboard boxes in our postal metaphor are just JSON objects, of course.
Here's what the two-layer query envelope described above looks like in JSON:

{                  # This is the outer envelope object
  "q0": {          # This is the first inner envelope. The name "q0" is arbitrary
    "query": {     # The first MQL query goes here
    }
  },
  "q1": {          # This is the second inner envelope
    "query": [{    # Second MQL query goes here. Note that this one is in []
    }]
  }
}

The response envelope has the same structure as the query envelope. In JSON it might look like
this:

{                                # This is the outer envelope object
  "q0": {                        # This is the first inner envelope. The name "q0" is arbitrary
    "status":"/mql/status/ok",   # The query was successful
    "result": {                  # The result of the first MQL query goes here
    }
  },
  "q1": {                        # This is the second inner envelope
    "status":"/mql/status/ok",   # The query was successful
    "result":[{                  # The result of the second MQL query goes here.
    }]
  }
}

The outer and inner query and response envelopes are described more formally below:

4.2.3.1. The Outer Query Envelope


The outer query envelope is a JSON object with one or more properties. The value of each
property must be an inner envelope object. The name of each property defines a name for the
query that is included in the inner envelope. The response envelope sent by mqlread includes a
property by the same name.

4.2.3.2. The Inner Query Envelope


The inner query envelope is a JSON object that must have a property named query. The value
of this property is a MQL query.

The inner query envelope may include other properties that provide additional information about
how the query should be executed. At the time of this writing the only such property is cursor,
which is documented in Section 4.7.
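
Building the two layers of query envelope in code is a one-line transformation. The Python sketch below wraps any number of named queries. The helper name build_envelope, the query names and the band name are placeholders, and the sketch does not add the optional cursor property.

```python
import json


def build_envelope(queries):
    """Wrap a dict of {query_name: MQL query} in inner and outer envelopes."""
    return {name: {"query": query} for name, query in queries.items()}


envelope = build_envelope({
    "q0": {"type": "/music/artist", "name": "Some Band", "album": []},
    "q1": [{"type": "/type/domain", "id": "/music", "types": []}],
})
print(json.dumps(envelope, indent=2))
```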

4.2.3.3. The Outer Response Envelope


The outer response envelope has the same properties as the outer query envelope. Each of these
properties names a query, and its value is the inner response envelope for that query.

4.2.3.4. The Inner Response Envelope


Each inner response envelope object has a status property. (Note: the mqlread implementation
may also include a status property in the outer response envelope. This outer status property
is not the same and should not be used.) If the query was successful, then the status property is
"/mql/status/ok". In this case, the inner response envelope also has a property named result,
and the value of this property is the MQL result of the query.

If, on the other hand, something was wrong with the query, then the status property will be
"/mql/status/error", and the inner response envelope will have a messages property whose
value is a JSON array of message objects that provide details about the error or errors. mqlread
error messages are documented in Section 4.6.
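
Opening a response envelope follows directly from this description. The Python sketch below takes an already-parsed response and returns the result for one named query, raising an exception when the status is not "/mql/status/ok". The names open_envelope and MQLError are ours. The status and message fields of the message object are the ones read by Example 4.2, and the sample response is invented for illustration.

```python
class MQLError(Exception):
    """Raised when an inner response envelope reports an error."""


def open_envelope(response, name):
    """Return the MQL result for the query called `name`, or raise MQLError."""
    inner = response[name]                        # open the outer envelope
    if inner.get("status") != "/mql/status/ok":
        messages = inner.get("messages") or [{}]
        first = messages[0]                       # report the first message only
        raise MQLError("%s: %s" % (first.get("status"), first.get("message")))
    return inner["result"]                        # open the inner envelope


# A made-up response holding one successful and one failed query.
sample = {
    "q0": {"status": "/mql/status/ok", "result": {"name": "Some Band"}},
    "q1": {"status": "/mql/status/error",
           "messages": [{"status": "/mql/status/error",
                         "message": "placeholder error text"}]},
}

print(open_envelope(sample, "q0"))
try:
    open_envelope(sample, "q1")
except MQLError as error:
    print("query failed:", error)
```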

4.3. A Python Album Lister
Now that we've seen how mqlread works in more formal detail, let's return to example code, and
re-write our album listing script in Python. Example 4.3 is a Python module that defines the
utility function metaweb.read(). This function:

• takes a MQL query (as a Python data structure, not as JSON-serialized text) as its argument;

• wraps the query in inner and outer envelope objects;

• serializes the outer envelope object to a JSON string;

• URI encodes the serialized envelope;

• sets the URL queries parameter to the serialized and encoded envelope;

• sends any authentication credentials passed to the function in a Cookie header of the
  HTTP request;

• obtains the query result, in text form, by fetching the contents of the URL;

• parses the JSON string returned by mqlread into a Python data structure;

• opens the outer response envelope to get the inner envelope;

• checks the status property in the inner response envelope to determine whether the query was
  successful (if the query failed, it extracts the error message from the inner envelope and
  raises an exception);

• gets the query result from the inner envelope and returns it as a Python data structure.

This code relies on the simplejson module for JSON encoding and parsing. You can find the
simplejson code at http://cheeseshop.python.org/pypi/simplejson.

Example 4.3. metaweb.py: using mqlread with Python

import urllib        # URL encoding
import urllib2       # Higher-level URL content fetching
import simplejson    # JSON serialization and parsing

host = 'www.freebase.com'              # The Metaweb host
readservice = '/api/service/mqlread'   # Path to mqlread service

# Submit the MQL query q and return the result as a Python object.
# If authentication credentials are supplied, use them in a cookie.
# Raises MQLError if the query was invalid. Raises urllib2.HTTPError if
# mqlread returns an HTTP status code other than 200 (which should not happen).
def read(q, credentials=None):
    # Put the query in an envelope
    env = {'qname':{'query':q}}
    # JSON serialize and URL encode the envelope and the query parameter
    args = urllib.urlencode({'queries':simplejson.dumps(env)})
    # Build the URL and create a Request object for it
    url = 'http://%s%s?%s' % (host, readservice, args)
    req = urllib2.Request(url)

    # Send our authentication credentials, if any, as a cookie.
    # The need for mqlread authentication is a temporary restriction.
    if credentials:
        req.add_header('Cookie', credentials)

    # Now open the URL and parse its JSON content
    f = urllib2.urlopen(req)         # Open the URL
    response = simplejson.load(f)    # Parse JSON response to an object
    inner = response['qname']        # Open outer envelope; get inner envelope

    # If anything was wrong with the invocation, mqlread will return an HTTP
    # error, and the code above will raise urllib2.HTTPError.
    # If anything was wrong with the query, we won't get an HTTP error, but
    # will get an error status code in the response envelope. In this case
    # we raise our own MQLError exception.
    if inner['status'] != '/mql/status/ok':
        error = inner['messages'][0]
        raise MQLError('%s: %s' % (error['status'], error['message']))

    # If there was no error, then just return the result from the envelope
    return inner['result']

# If anything goes wrong when talking to a Metaweb service, we raise MQLError.
class MQLError(Exception):
    def __init__(self, value):    # This is the exception constructor method
        self.value = value
    def __str__(self):            # Convert error object to a string
        return repr(self.value)

With the metaweb.read() function defined, we can now write our album listing code in Python.
Example 4.4 shows how we do this. Note that this example uses the metaweb.login() function
for authentication. The implementation of this function is in Chapter 6.

Example 4.4. albumlist.py: listing albums in Python

import sys
import metaweb    # Defines the metaweb.read() and login() functions

# Compose our MQL query using a Python data structure
band = sys.argv[1]                        # The band we want
query = { 'type': '/music/artist',        # Our MQL query in Python
          'name': band,
          'album': [{ 'name': None,       # None is Python's null
                      'release_date': None,
                      'sort': 'release_date' }]}

# Login to get authentication credentials
# Insert your Freebase username and password here.
credentials = metaweb.login("username", "password")

# Submit the query using metaweb.read() and check for valid results
result = metaweb.read(query, credentials)
if not result or not result['album']: sys.exit('Unknown band')

# A utility function to get year from a MQL datetime value
def getYear(date):
    if not date: return ''
    return "[%s]" % date[0:4]

# Now output the results
for album in result['album']:
    print "%s %s" % (album['name'], getYear(album['release_date']))

4.4. A Metaweb-enabled PHP Web Application
In this section, we'll demonstrate how to create an online version of our album-lister application.
We'll use the server-side scripting language PHP to create the web application that was shown
in Figure 1.2 of Chapter 1. Example 4.5 is a PHP file that defines a class named Metaweb. This
class has a single method, named read, that behaves just like the metaweb.read() function defined
in Example 4.3.

The code in Example 4.5 is commented and you should be able to follow it even if you are not
familiar with PHP. One point to note is that in PHP the data structure known as an array works
as both a sequential array and as an associative array. That is, JSON objects and JSON arrays
are both arrays in PHP. Example 4.5 depends on an external module for JSON serialization and
parsing. The module used here is from http://pear.php.net.

Example 4.5. metaweb.php: using mqlread with PHP

<?php
/*
 * The Metaweb class defines a read() method for invoking the Metaweb
 * mqlread service on freebase.com. read() takes a MQL query (as a PHP
 * array) and freebase.com authentication credentials. It sends
 * that query to the mqlread service and retrieves the response. It parses
 * the response to a PHP array, and extracts the query result from the
 * response envelopes and returns it. If the query fails, it returns
 * null (without providing useful diagnostics).
 */
require "JSON.php";   // A JSON encoder/decoder from http://pear.php.net

class Metaweb {
    var $json;   // Holds the JSON encoder/decoder object
    var $URL = "http://www.freebase.com/api/service/mqlread";

    // Our constructor function sets up the JSON encoder/decoder.
    // It is named __construct because newer PHP versions no longer treat
    // a method named after the class as the constructor.
    function __construct() {
        // Set up our JSON encoder and decoder object
        $this->json = new Services_JSON(SERVICES_JSON_LOOSE_TYPE);
    }

    // This method submits a query and synchronously returns its result.
    // If authentication credentials are passed, it uses them as an HTTP Cookie.
    function read($queryobj, $credentials) {
        // Put the query into an envelope object
        $envelope = array("qname" => array("query" => $queryobj));

        // Serialize the envelope object to JSON text
        $querytext = $this->json->encode($envelope);

        // Then URL encode the serialized text
        $encoded = urlencode($querytext);

        // Now build the URL that represents the query
        // Note that we use an HTTP GET request for read queries
        $url = $this->URL . "?queries=" . $encoded;

        // Use the curl library to send the query and get response text
        $request = curl_init($url);

        // Return the result instead of printing it out.
        curl_setopt($request, CURLOPT_RETURNTRANSFER, TRUE);

        // If we have credentials, send them with the request as a cookie
        if ($credentials) curl_setopt($request, CURLOPT_COOKIE, $credentials);

        // Now fetch the URL
        $responsetext = curl_exec($request);
        curl_close($request);

        // Parse the server's response from JSON text into a PHP array
        $response = $this->json->decode($responsetext);

        // Return null if the query was not successful
        if ($response["qname"]["status"] != "/mql/status/ok") return null;

        // Otherwise, open the envelope and just return the actual result object
        return $response["qname"]["result"];
    }
}
?>

With the PHP utility class defined in Example 4.5, it becomes easy to write simple
Metaweb-enabled web applications in PHP. Example 4.6 demonstrates. It displays an HTML form in
which the user can enter the name of a band. When the form is submitted, it lists the albums by
that band.

Like Example 4.1, this example requires you to hard-code the value of your freebase.com
authentication cookie into the script. See the instructions earlier in this chapter for finding the
value of the metaweb-user cookie.

Example 4.6. albumlist.php: A Metaweb-enabled web application in PHP

<html>
<body>
<form>Band: <input type="text" name="band"><input type="submit"></form>
<?php
// What band is specified in the URL? (Empty string if none was given)
$band = isset($_GET["band"]) ? $_GET["band"] : "";
if ($band) {                   // Only list albums if a band has been specified
    require "metaweb.php";     // Import Metaweb utility code
    $metaweb = new Metaweb();  // Create a Metaweb object

    // Build a MQL request for the list of albums by the band
    $query = array("type" => "/music/artist",  // We want a musical artist
                   "name" => $band,            // This is its name
                   "album" => array());        // Fill in this empty albums array!

    // Insert your own freebase.com cookie data into the string below
    $credentials = 'metaweb-user=### Put your cookie data here ### ';

    // Submit the query using the utility function defined earlier
    $result = $metaweb->read($query, $credentials);

    // This is the array of albums we want (empty if the query failed)
    $albums = $result ? $result["album"] : array();

    // Display the albums on the web page, escaping text for safe HTML output
    echo "<hr><h1>Albums by " . htmlspecialchars($band) . "</h1>";
    foreach ($albums as $album) echo htmlspecialchars($album) . "<br>";
}
?>
</body>
</html>

4.5. Metaweb Queries with JavaScript


Since the MQL syntax is based on JSON, Metaweb queries are most gracefully expressed in
JavaScript. We haven't seen a JavaScript-based Metaweb application so far for one important
reason: the same-origin policy. The same-origin policy is a sweeping (but necessary) security
restriction in JavaScript that says that code embedded in a document that was served by server
A can only interact with content that is also served by server A. This restriction applies to the
XMLHttpRequest object which is what is typically used to fetch the contents of a URL. A web
application hosted at www.freebase.com can use XMLHttpRequest to submit MQL queries to the
mqlread service on that same server, but this is not allowed for web applications hosted on any
other server.

There are two workarounds to this restriction. The first, and most obvious, is to run a proxy script
on your own site that behaves like the mqlread service but simply forwards your query to
freebase.com.

The second workaround relies on the fact that a query result, in JSON format, is valid JavaScript
code. This means that a mqlread URL can be used as the value of the src attribute of a <script>
tag. When the server returns its result, the <script> tag evaluates the JSON text as JavaScript
code. Evaluating the JSON text creates the JavaScript object we want, but to make this scheme
work, the script then has to be able to do something with that object. The solution is to add another
URL parameter to the mqlread service. If the URL for your query includes a callback= parameter,
then mqlread will take the value of that parameter to be the name of a JavaScript function. Then,
instead of simply returning a JSON text, it will return the specified function name, an open
parenthesis, the JSON text and a close parenthesis. When used this way with a <script> tag, the
JSON text is evaluated, and the object that results is passed to the specified function (which you
have defined previously).
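
The padding scheme described above can also be sketched outside the browser. The Python 3 snippet below is only an illustration and is not part of any Metaweb library: mqlread_url() and pad_response() are hypothetical helper names, and the callback name handleResult is made up. It builds a query URL that carries a callback parameter and shows the shape of the text the server would be expected to send back.

```python
import json
from urllib.parse import urlencode

# The mqlread URL used throughout this chapter
MQLREAD = "http://www.freebase.com/api/service/mqlread"

def mqlread_url(query, callback=None):
    """Build a mqlread URL; add callback= when a function name is given."""
    envelope = {"qname": {"query": query}}           # outer and inner envelopes
    params = [("queries", json.dumps(envelope))]
    if callback:
        params.append(("callback", callback))
    return MQLREAD + "?" + urlencode(params)

def pad_response(callback, json_text):
    """Illustrate the padded reply: function name, '(', JSON text, ')'."""
    return callback + "(" + json_text + ")"

url = mqlread_url({"type": "/music/artist", "name": "The Police"}, "handleResult")
padded = pad_response("handleResult", '{"qname": {"status": "/mql/status/ok"}}')
```

A <script> tag whose src attribute is set to a URL like the one built here would evaluate text of the same form as padded, which is simply a call to the named function.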

We use the <script> technique in this chapter: it is simple, elegant and in common use across
the internet. If you prefer a proxy and XMLHttpRequest-based approach, you can find sample
proxy code in Appendix A.

One thing you'll notice about our JavaScript examples here (and in Appendix A) is that they
are asynchronous: when you submit a query you do not get the result immediately. Instead, the
callback function you specify is invoked when the result is available. This asynchronous
programming model is common in client-side JavaScript, but is substantially different from the
synchronous model demonstrated in Example 4.5 and other examples.

JavaScript Web Apps and Cookies

The JavaScript-based web applications shown in this chapter do not attempt to automatically
obtain authentication credentials. They simply assume that you have visited and logged in
to www.freebase.com before you attempt to run them. If you do this, then your browser
will have the authentication cookies it needs, and it will automatically pass those cookies
to mqlread when <script> tags invoke it.

4.5.1. Listing Albums and Tracks with JavaScript


Let's jump right in. Example 4.7 is a JavaScript-based album and track lister application, pictured
in Figure 4.1. Notice that this code depends on two external modules. json.js is a file that defines
JavaScript functions for parsing and serializing JSON. The code for this module is not shown
here; you can find it as Example A.1 in Appendix A. The second module of external code is
metaweb.js. This module, whose code is listed in Example 4.8, defines the utility function
Metaweb.read() that submits a MQL query, through a <script> tag, to the mqlread service.



Figure 4.1. Listing albums and tracks

Example 4.7 lists the albums by a specified band, and also displays the tracks on an album when
the user clicks on the name of the album. There are several features worth noting in this example.
First, notice that this code uses the Metaweb.read() function to send queries to the mqlread service.
We'll see how Metaweb.read() is implemented in the next section. Second, note that the code
displays a "Loading..." message to the user while the queries are pending and also displays an
appropriate message if a query fails. Finally, Example 4.7 demonstrates two different ways to
insert Metaweb query results into an HTML document. When the album query returns, the album
list is populated using DOM methods to build each text node and <div> tag. When the track query
returns, on the other hand, the list of tracks is built as a string of HTML text and is inserted into
the document by setting the innerHTML property of the container element.

In addition to querying album and track names, the queries in this example also ask for album
release date and track length. The example includes utility functions to massage this data, extracting
a year from a /type/datetime string and converting a track length in seconds into the more
familiar mm:ss format.
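
The two data-massaging helpers are small enough to sketch on their own. The version below is written in Python 3 purely for illustration; the names get_year() and to_minutes_and_seconds() are ours, and the sketch assumes that a /type/datetime value is a string that begins with a four-digit year. It is slightly stricter than the JavaScript in Example 4.7, which accepts any four-character string as a year.

```python
import re

def get_year(date):
    """Return the four-digit year from a /type/datetime string, or None."""
    if not date:
        return None
    # Accept "1983" as well as longer forms such as "1983-06-17"
    match = re.match(r"^(\d{4})(-|$)", date)
    return match.group(1) if match else None

def to_minutes_and_seconds(seconds):
    """Format a track length given in seconds as minutes:seconds."""
    minutes, secs = divmod(int(seconds), 60)   # drop any fractional second
    return "%d:%02d" % (minutes, secs)

year = get_year("1983-06-17")
length = to_minutes_and_seconds(253.4)
```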

Example 4.7. albumlist.html: a JavaScript album and track lister

<html>
<head>
<script src="json.js"></script>     <!-- JSON utilities -->
<script src="metaweb.js"></script>  <!-- Metaweb.read() function -->
<script>
/* Display albums by the specified band */
function listalbums(band) {
    // Find the document elements we need to insert content into
    var title = document.getElementById("title");
    var albumlist = document.getElementById("albumlist");
    var tracklist = document.getElementById("tracklist");

    title.innerHTML = "Albums by " + band;             // Set the page title
    albumlist.innerHTML = "<b><i>Loading...</i></b>";  // Album list is coming...
    tracklist.style.visibility = "hidden";             // Hide any old tracks

    var query = {                     // This is our MQL query
        type: "/music/artist",        // Find a band
        name: band,                   // With the specified name
        album: [{                     // We want to know about albums
            name: null,               // Return album names
            release_date: null,       // And release dates
            sort: "release_date"      // Order by release date
        }]
    };

    // Issue the query and invoke the function below when it is done
    Metaweb.read(query, displayAlbums);

    // This function is invoked when we get the result of our MQL query
    function displayAlbums(result) {
        // If no result, the band was unknown.
        if (!result || !result.album) {
            albumlist.innerHTML = "<b><i>Unknown band: " + band + "</i></b>";
            return;
        }

        // Otherwise, the result object matches our query object,
        // but has album data filled in.
        var albums = result.album;  // the array of album data
        // Erase the "Loading..." message we displayed earlier
        albumlist.innerHTML = "";
        // Loop through the albums
        for(var i = 0; i < albums.length; i++) {
            var name = albums[i].name;                    // album name
            var year = getYear(albums[i].release_date);   // album release year
            var text = name + (year?(" ["+year+"]"):"");  // name+year

            // Create HTML elements to display the album name and year.
            var div = document.createElement("div");
            div.className = "album";
            div.appendChild(document.createTextNode(text));
            albumlist.appendChild(div);

            // Add an event handler to display tracks when an album is clicked
            div.onclick = makeHandler(band, albums[i].name);
        }

        // This function returns a function. We do it this way to create
        // a closure that captures the band and album names.
        function makeHandler(band, album) {
            return function(e) { listtracks(band, album); }
        }
    }

    // A utility function to return the year portion of a Metaweb /type/datetime
    function getYear(date) {
        if (!date) return null;
        if (date.length == 4) return date;
        if (date.match(/^\d{4}-/)) return date.substring(0,4);
        return null;
    }
}

/* Display the tracks on the specified album by the specified band */
function listtracks(band, albumname) {
    // Begin by displaying a Loading... message
    var tracklist = document.getElementById("tracklist");
    tracklist.innerHTML = "<h2>" + albumname + "</h2><p>Loading...";
    tracklist.style.visibility = "visible";

    // This is the MQL query we will issue
    var query = {
        type: "/music/album",
        name: albumname,
        artist: band,
        // Get track names and lengths, sorted by index
        track: [{name:null, length:null, index:null, sort:"index"}]
    };

    // Issue the query, invoke the nested function when the response arrives
    Metaweb.read(query, function(result) {
        if (result && result.track) {   // If result is defined
            var tracks = result.track;  // array of tracks
            // Build an array of track names + lengths
            var listitems = [];
            for(var i = 0; i < tracks.length; i++) {
                var n = tracks[i].name + " (" +
                    toMinutesAndSeconds(tracks[i].length)+")";
                listitems.push(n);
            }
            // Display the track list by setting innerHTML
            tracklist.innerHTML = "<h2>" + albumname + "</h2>" +
                "<ol><li>" + listitems.join("<li>") + "</ol>";
        }
        else {
            // If empty result display error message
            tracklist.innerHTML = "<h2>" + albumname + "</h2>" +
                "<p>No track list is available.";
        }
    });

    // Convert track length in seconds to minutes:seconds format
    function toMinutesAndSeconds(seconds) {
        var minutes = Math.floor(seconds/60);
        var secs = Math.floor(seconds-(minutes*60));  // Leftover whole seconds
        if (secs <= 9) secs = "0" + secs;             // Pad to two digits
        return minutes + ":" + secs;
    }
}
</script>
<!-- A CSS stylesheet to make the output look nice -->
<style>
#albumlist { width:50%; padding: 5px; }
#tracklist {
width: 45%; float:right; visibility:hidden;
padding: 5px; border: solid black 2px; margin-right: 10px;
background-color: #8a8;
}
#tracklist h2 { font: bold 16pt sans-serif; text-align: center;}
#tracklist p { text-align: center; font: italic bold 12pt sans-serif; }
#tracklist li { font-style: italic;}
div.album { font: bold 12pt sans-serif; margin: 2px;}
div.album:hover {text-decoration: underline;}
</style>
</head>
<body>
<!-- The HTML form in which the user can enter the name of a band -->
<!-- It invokes listalbums() when the user hits Return or clicks the button -->
<form onsubmit="listalbums(this.band.value); return false;">
<b>Enter the name of a band: </b>
<input type="text" name="band">
<input type="submit" value="List Albums">
</form>
<hr>
<!-- This is where we insert the results of our Metaweb queries -->
<h1 id="title"></h1> <!-- display band name here -->
<div id="tracklist"></div> <!-- list tracks here -->
<div id="albumlist"></div> <!-- list of albums here -->
</body>
</html>



4.5.2. Client-side MQL Queries with <script>
In this section, we develop the Metaweb.read() utility function used by Example 4.7. The code
in Example 4.8 is short but somewhat complicated. The key to understanding it is to realize that
each call to Metaweb.read() defines a function with a name like Metaweb._3() (the number is
different on each invocation). This function does the work of processing the response from the
Metaweb server. In order to get this function invoked, Metaweb.read() adds a callback parameter
to the mqlread query URL, like this:

&callback=Metaweb._3

When the mqlread service is invoked with this callback parameter, it does not return the result
as a pure JSON object. Instead it returns JavaScript code. The code is simply a function invocation
of the function named by the parameter. The invocation includes a JSON object as the single
argument to the function:

Metaweb._3(/* JSON object goes here */)

Since JSON is a subset of the JavaScript object and array literal syntax, any JSON object is a
valid function argument. By simply wrapping a function invocation around the JSON object,
we've converted the mqlread response into a form suitable for use with a <script> tag.
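
To make the wrapping concrete, here is a rough Python 3 sketch of the reverse operation: taking a callback-wrapped response apart again and opening its envelopes. Nothing here is part of Metaweb; unwrap_callback() and open_envelopes() are hypothetical names, the sample response text is invented, and the parsing assumes that the response is exactly a function name followed by one parenthesized JSON value.

```python
import json

def unwrap_callback(text):
    """Split 'name(json)' into the callback name and the parsed JSON value."""
    name, paren, rest = text.partition("(")
    rest = rest.rstrip()
    if not paren or not rest.endswith(")"):
        raise ValueError("not a callback-wrapped response")
    return name.strip(), json.loads(rest[:-1])   # drop the closing parenthesis

def open_envelopes(outer, qname="qname"):
    """Return the result held in the inner envelope named qname."""
    inner = outer[qname]
    if inner["status"] != "/mql/status/ok":
        raise ValueError(inner["status"])
    return inner["result"]

# An invented response of the form described above
sample = 'Metaweb._3({"qname": {"status": "/mql/status/ok", "result": {"name": "Synchronicity"}}})'
callback, outer = unwrap_callback(sample)
result = open_envelopes(outer)
```

In the browser none of this parsing is needed, because the JavaScript interpreter evaluates the text directly, which is the whole appeal of the <script> technique.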

Note that Metaweb.read() uses JSON.serialize() to serialize the query object into JSON form.
This utility function is defined in Example A.1 in Appendix A. The corresponding JSON.parse()
function is not required, however, since the JavaScript interpreter that processes the <script>
tag serves as our JSON parser.

Example 4.8. metaweb.js: Metaweb queries with script tags

/**
 * metaweb.js:
 *
 * This file implements a Metaweb.read() utility function using a <script>
 * tag to generate the HTTP request and the URL callback parameter to
 * route the response to a specified JavaScript function.
 **/
var Metaweb = {};                                // Define our namespace
Metaweb.HOST = "http://www.freebase.com";        // The Metaweb server
Metaweb.QUERY_SERVICE = "/api/service/mqlread";  // The service on that server
Metaweb.counter = 0;                             // For unique function names

// Send query q to Metaweb, and pass the result asynchronously to function f
Metaweb.read = function(q, f) {
    // Define a unique function name
    var callbackName = "_" + Metaweb.counter++;

    // Create a function by that name in the Metaweb namespace.
    // This function expects to be passed the outer response envelope.
    // If the query fails, this function throws an exception. Since it
    // is invoked asynchronously, we can't catch the exception, but it serves
    // to report the error to the JavaScript console.
    Metaweb[callbackName] = function(outerEnvelope) {
        // Clean up first, so that a failed query leaves nothing behind
        document.body.removeChild(script);        // Clean up <script> tag
        delete Metaweb[callbackName];             // Delete this function
        var innerEnvelope = outerEnvelope.qname;  // Open outer envelope
        // Make sure the query was successful.
        if (innerEnvelope.status != "/mql/status/ok") {  // Check for errors
            var error = innerEnvelope.messages[0];       // Get error message
            throw error.status + ": " + error.message;   // And throw it!
        }
        var result = innerEnvelope.result;  // Get result from inner envelope
        f(result);                          // Pass result to user function
    };

    // Put the query in inner and outer envelopes
    var envelope = {qname: {query: q}};

    // Serialize and encode the query object
    var querytext = encodeURIComponent(JSON.serialize(envelope));

    // Build the URL using encoded query text and the callback name
    var url = Metaweb.HOST + Metaweb.QUERY_SERVICE +
        "?queries=" + querytext + "&callback=Metaweb." + callbackName;

    // Create a script tag, set its src attribute and add it to the document
    // This triggers the HTTP request and submits the query
    var script = document.createElement("script");
    script.src = url;
    document.body.appendChild(script);
};

You'll find a proxy-based implementation of this same Metaweb.read() function in Example A.2
in Appendix A.

4.6. mqlread Errors


A number of things can go wrong when using mqlread. If you invoke it with incorrect URL
parameters or supply a query envelope that is not valid JSON, mqlread will respond with an
HTTP "400 Bad Request" error. The body of the response will be a JSON object that provides
details about the error. It might look like this:

{
  "status": "400 Bad request",
  "messages": [
    {
      "text": "JSON parse error: Expecting property name at line 1 column 1",
      "type": "/service/error/invalid_value",
      "field": "queries",
      "value": "{<}\r\n",
      "level": "error"
    }
  ]
}

This kind of error can occur if you are cutting-and-pasting raw mqlread URLs or if you are entering
MQL queries as JSON text into a query editor application. When you write scripts that use JSON
serializers and invoke mqlread using tested code, this kind of error should not occur. Errors are
still possible, however: a MQL query can be invalid, even if it is expressed using well-formed
JSON and passed to mqlread using the correct URL parameters.
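
The distinction between the two layers of failure can be sketched in a few lines of Python 3. This is an illustration under assumptions rather than a documented interface: classify_response() is a hypothetical helper, it is handed the HTTP status code and the response body by whatever HTTP client you use, and it relies on the "/mql/status/ok" value that the earlier examples in this chapter test for.

```python
import json

def classify_response(http_status, body_text, qname="qname"):
    """Return 'service_error', 'mql_error' or 'ok' for a mqlread response."""
    if http_status != 200:
        # For example 400 Bad Request: bad URL parameters or malformed JSON
        return "service_error"
    inner = json.loads(body_text)[qname]      # open the outer envelope
    if inner["status"] != "/mql/status/ok":
        # The request was well-formed, but the MQL query itself was invalid
        return "mql_error"
    return "ok"

bad_request = classify_response(400, '{"status": "400 Bad request", "messages": []}')
bad_query = classify_response(200, '{"qname": {"status": "/mql/status/error", "messages": []}}')
good = classify_response(200, '{"qname": {"status": "/mql/status/ok", "result": null}}')
```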

If you submit an invalid MQL query, mqlread returns an HTTP status code of "200 OK", but the
status property of the inner response envelope is "/mql/status/error". The inner response
envelope also includes a property named messages instead of a property named result. The
value of the messages property is an array (usually of length 1) of message objects each of which
has the following properties:

type A string that indicates what kind of message this is. For error messages, this is always
"/mql/error".

status An identifier that names the specific kind of error. /mql/status/parse_error and
/mql/status/type_error are typical values.

Note that the status property of a message object is distinct from, and more
informative than, the status property of the inner response envelope.

message A human-readable description of the error.

info An object that provides additional details about the error. For type errors, for example,
the properties of this object specify the value and type that appeared in the query
and the type that was expected.

query A copy of the query object with the addition of a special error_inside property
to indicate where the error occurs. For parse errors, this property is omitted, since the
query couldn't be properly parsed.

path A string that specifies the "path" of property names from the root of the MQL query
to the location of the error. If the error is in the outermost object of the query,
then this property is just an empty string. For parse errors, this property is omitted.
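
As a rough illustration of how these properties might be used, the following Python 3 sketch turns the first message object of a failed response into a one-line report. The helper name describe_mql_error() is ours, and the sample envelope is invented for the example (including the message text and the path value); only the property names come from the list above.

```python
def describe_mql_error(inner):
    """Format the first message object of a failed inner response envelope."""
    message = inner["messages"][0]
    text = "%s: %s" % (message["status"], message["message"])
    path = message.get("path")      # omitted for parse errors, may be ""
    if path:
        text += " (at %s)" % path
    return text

# An invented inner response envelope for a query with a type error
failed = {
    "status": "/mql/status/error",
    "messages": [{
        "type": "/mql/error",
        "status": "/mql/status/type_error",
        "message": "example message text",
        "path": "album",
    }],
}
report = describe_mql_error(failed)
```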

4.7. mqlread Cursors


When a MQL query is to be submitted to mqlread, it is placed inside an inner query envelope
object, as the value of a property named query. Often, this is the only property of the inner query
envelope. But a property named cursor is also allowed.

Use a cursor when you want to retrieve results in batches from a large result set. Start by including
this property in your inner query envelope:

cursor: true



When you do this, mqlread will include a cursor property in the inner envelope of its response.
If the value of the response cursor property is false, then mqlread has returned the complete
set of query results to you. If the cursor property is not false, then it will be a string containing
opaque data. Take the value of this cursor property, insert it back into your inner query envelope,
and send the query back to mqlread. mqlread will send you the next batch of results and will
again include a cursor property in the inner response envelope. Repeat this process until the
cursor property of the response is false.

Example 4.9 is a metaweb.readall() function written in Python. It works like the metaweb.read()
function of Example 4.3, but uses a cursor to iterate through a large result set, making multiple
queries and concatenating the results into a single array before returning them. (Note that this
function doesn't allow any kind of parallelism: it does not allow the first batch of results to be
processed while the second batch is being fetched, for example. So if you're using a limit directive
and cursors to improve response time, this readall() method is not appropriate.)

Example 4.9. metaweb.py: querying Metaweb with a cursor, in Python

import urllib        # URL encoding
import urllib2       # Higher-level URL content fetching
import simplejson    # JSON serialization and parsing

host = 'www.freebase.com'              # The Metaweb host
readservice = '/api/service/mqlread'   # Path to mqlread service

# Submit the MQL query q and return the result as a Python object
# This function behaves like read() above, but uses cursors so that
# it works even for very large result sets
def readall(q, credentials=None):
    # This is the start of the mqlread URL.
    # We just need to append the envelope to it
    urlprefix = 'http://%s%s?queries=' % (host, readservice)

    # The query and most of the envelope are constant. We just need to append
    # the encoded cursor value and some closing braces to this prefix string
    jsonq = simplejson.dumps(q)
    envelopeprefix = urllib.quote_plus('{"q0":{"query":'+jsonq+',"cursor":')

    cursor = 'true'   # This is the initial value of the cursor
    results = []      # We accumulate results in this array

    # Loop until mqlread tells us there are no more results
    while cursor:
        # append the cursor and the closing braces to the envelope
        envelope = envelopeprefix + urllib.quote_plus(cursor + '}}')
        # append the envelope to the URL
        url = urlprefix + envelope

        # Begin an HTTP request for the URL
        req = urllib2.Request(url)

        # Send our authentication credentials, if any, as a cookie.
        # The need for mqlread authentication is a temporary restriction.
        if credentials:
            req.add_header('Cookie', credentials)

        # Read and parse the URL contents
        f = urllib2.urlopen(req)        # Open URL
        response = simplejson.load(f)   # Parse JSON response
        inner = response['q0']          # Get inner envelope from outer

        # Raise a MQLError if there were errors
        if inner['status'] != '/mql/status/ok':
            error = inner['messages'][0]
            raise MQLError('%s: %s' % (error['status'], error['message']))

        # Append this batch of results to the main array of results.
        results.extend(inner['result'])

        # Finally, get the new value of the cursor for the next iteration
        cursor = inner['cursor']
        if cursor:                        # If it is not false, put it
            cursor = '"' + cursor + '"'   # in quotes as a JSON string

    # Now that we're done with the loop, return the results array
    return results
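
If you do want to start processing the first batch before the later ones have been fetched, one possible variation is to hand back each batch as it arrives. The Python 3 generator below is a sketch of that idea, not a replacement for Example 4.9: iter_batches() is a hypothetical name, and fetch stands for any function you supply that sends an outer envelope (a dictionary) to mqlread and returns the parsed outer response envelope. Passing the transport in as a parameter keeps the cursor logic separate from the HTTP details.

```python
def iter_batches(query, fetch, qname="q0"):
    """Yield each batch of results as soon as mqlread returns it.

    fetch is a caller-supplied (hypothetical) function: it takes an outer
    query envelope and returns the parsed outer response envelope.
    """
    cursor = True                          # the initial value asks for a cursor
    while cursor:
        envelope = {qname: {"query": query, "cursor": cursor}}
        inner = fetch(envelope)[qname]
        if inner["status"] != "/mql/status/ok":
            error = inner["messages"][0]
            raise RuntimeError("%s: %s" % (error["status"], error["message"]))
        yield inner["result"]
        cursor = inner["cursor"]           # False when no results remain
```

A caller can then loop with for batch in iter_batches(query, fetch) and handle each batch while deciding whether to continue. Requests are still issued one at a time, so this improves time to first result rather than overall throughput.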

It is important to understand that cursors only work when multiple results are expected at the
top-level of the query. The cursor property is part of the mqlread envelope syntax, not part of
the MQL query language, and it cannot be applied to sub-queries of a query. Another way to say
this is that it only makes sense to include "cursor":true in an envelope if the first character
following "query": in the envelope is [. The query must be expressed as an array in order for a
cursor to be meaningful.

Consider the code in Example 4.4. It contains this query:

query = { 'type': '/music/artist',       # Our MQL query in Python
          'name': band,
          'album': [{ 'name': None,      # None is Python's null
                      'release_date': None,
                      'sort': 'release_date' }]}

This is a perfectly valid query, and works just fine in Example 4.4. But suppose we wanted to
port that script to use the metaweb.readall() function defined above. To do this, we'd also have
to alter the query so that the array of albums was at the top level of the query:

query = [{'type': '/music/album',
          'artist': band,
          'name': None,
          'release_date': None,
          'sort': 'release_date',
          'limit': 10}]



Note that we've added an explicit limit directive to this modified query. In general, it makes
sense to specify an explicit limit when using cursors.
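
The rule that a cursor needs a top-level array is easy to enforce before a query is ever sent. The following Python 3 sketch shows one way to do so; cursor_envelope() is a hypothetical helper, the band name is a placeholder, and the envelope name q0 simply follows Example 4.9.

```python
import json

def cursor_envelope(query, cursor=True, qname="q0"):
    """Serialize an outer envelope that asks for a cursor.

    Raises ValueError unless the top level of the query is an array.
    """
    if not isinstance(query, list):
        raise ValueError("cursors need a query whose top level is an array")
    return json.dumps({qname: {"query": query, "cursor": cursor}})

band = "The Police"    # placeholder band name for the illustration
album_query = [{"type": "/music/album",
                "artist": band,
                "name": None,
                "release_date": None,
                "sort": "release_date",
                "limit": 10}]
text = cursor_envelope(album_query)
```

In the serialized text, the first character that follows "query": is [, as the rule above requires.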

4.8. Fetching Content with trans


As explained in Chapter 2, Metaweb is really two databases in one. One database is the graph of
nodes and relationships. The second is the content store that holds chunks of data such as HTML
documents and graphical images. We use the mqlread service to retrieve data from the graph, and
we use the trans service to retrieve content from the content store.

The trans service is so named because in addition to fetching the requested data, it can also
translate it for you. For example, it can "translate" an image to thumbnail size.

The trans service is HTTP based, just as mqlread is. Content is retrieved by specifying the desired
translation and the content id, with a URL of this form:

http://www.freebase.com/api/trans/translation/guid

Here, for example, is an actual trans URL at freebase.com:

http://www.freebase.com/api/trans/raw/%239202a8c04000641f8000000003c1978c

The translation portion of a trans URL must be one of the following:

raw Use raw to request that no translation be done on the data: it should be
returned as is. (Note, however, that HTML content is not completely raw: it
is "sanitized" by stripping executable content such as JavaScript.)

image_thumb Use image_thumb to request a thumbnail-sized version of an image.

blurb Use blurb to request an excerpt from the beginning of a document. This
provides a kind of a preview, of the kind you might see in a list of search
results.
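
Assembling a trans URL is a matter of string concatenation plus URL-encoding of the guid. The Python 3 sketch below is an illustration only: trans_url() is a hypothetical helper name, and it assumes that the guid is given in its unencoded form, beginning with a # character.

```python
from urllib.parse import quote

HOST = "http://www.freebase.com"
TRANSLATIONS = ("raw", "image_thumb", "blurb")   # the translations listed above

def trans_url(translation, guid):
    """Return the trans URL for the given translation and content guid."""
    if translation not in TRANSLATIONS:
        raise ValueError("unknown translation: %r" % translation)
    # safe="" makes quote() encode every reserved character, so # becomes %23
    return "%s/api/trans/%s/%s" % (HOST, translation, quote(guid, safe=""))

url = trans_url("raw", "#9202a8c04000641f8000000003c1978c")
```

The value of url here matches the sample trans URL shown earlier in this section.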

The path component that follows the translation is the URL-encoded version of a Metaweb guid.
%23 is the encoding of the # character, and the letters and digits that follow are the hexadecimal
digits of the guid. The guid passed to trans must identify an object of type /type/content,
/common/image or /common/document. These three types are closely related:

/type/content A /type/content object is the representation in the Metaweb graph
of an entry in the Metaweb content store.

/common/image When an image is added to the content store, the /type/content object
for the image is co-typed /common/image, in order to add a size
property that supplies the image dimensions. For images, therefore,
the /type/content and /common/image objects share the same guid.

/common/document When document content is added to the content store, a /type/content
object is created to represent the entry in the content store. A separate
/common/document object is also created. The content property of the
document object refers to the content object. Other properties of the
/common/document object provide additional meta-information about
the document.

/common/document objects can also represent Wikipedia document
content (which is not stored in the Metaweb content store). Documents
that represent Wikipedia entries have content properties of null.

Given the guid of a document object, the trans service returns the
content of both Wikipedia and non-Wikipedia documents. For non-
Wikipedia documents, you can use either the guid of the
/common/document object or of the /type/content object it refers to.

The trans service does not support a callback parameter as the mqlread service does, so you
cannot use it with <script> tags. If you implement a proxy on your own web server, however, you
can invoke the trans service indirectly and retrieve content with XMLHttpRequest.

It is usually easier to use the trans service with <img> and <iframe> tags. To retrieve
and display an image, simply use a trans URL as the src attribute of an <img> tag. And to retrieve
and display the HTML content of a document, use a trans URL as the src attribute of an <iframe>.

Like mqlread, the trans service requires cookie-based authentication during the freebase.com
roll-out period. The examples in this chapter assume that you are using the trans service in a web
browser that has visited and logged on to www.freebase.com.

HTML Content, <iframe> tags, and Security

When you display Metaweb content in an <iframe>, the origin of the framed content is
different from the origin of your web application. This means that the same-origin security
rules apply. The user of your web application can see the framed Metaweb content, but
JavaScript code in your application cannot access this content.

Metaweb strips executable code from HTML before returning it to you, but if any JavaScript
code were somehow to make it past Metaweb's sanitizer, that code would also be subject
to the same-origin policy and would be unable to interact with your web application content.
In this way, using an <iframe> for content fetched with trans gives you an extra layer of
security.

4.8.1. Browsing Recent Content on freebase.com


Example 4.10 is a JavaScript-based example that demonstrates the use of the trans service, and
the raw, image_thumb, and blurb translations. It uses mqlread to find the ten images and ten
documents most recently added to freebase.com. It then generates <img> and <iframe> tags with
trans URLs to display thumbnails for the images and blurbs for the documents. It also generates
hyperlinks so that the thumbnails and blurbs are linked to full-sized versions of the images and
documents. (These links open new windows to display the image or document.)



Example 4.10 does not use the Metaweb.read() utility function developed earlier in the chapter.
Instead, it defines a variant of that function called sendQueries(). This sendQueries() function
sends multiple queries, in multiple inner envelopes bundled together into a single outer envelope.
This means that the example can ask for recent images and recent documents in a single invocation
of mqlread. If the example had used Metaweb.read(), it would have had to invoke mqlread twice.
Remember that you must log on to www.freebase.com before using this example.
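
The bundling of several queries into one outer envelope is independent of JavaScript, so it can be sketched briefly in Python 3. The helper names bundle_queries() and results_by_name() are hypothetical, and the sample response is invented; the envelope names images and docs are the ones used in Example 4.10.

```python
def bundle_queries(**named_queries):
    """Wrap each named query in its own inner envelope inside one outer envelope."""
    return {name: {"query": q} for name, q in named_queries.items()}

def results_by_name(response, names):
    """Map each query name to its result, given the outer response envelope."""
    return {name: response[name]["result"] for name in names}

outer = bundle_queries(
    images=[{"type": "/common/image", "id": None, "limit": 10}],
    docs=[{"type": "/common/document", "id": None, "limit": 10}],
)

# An invented outer response envelope with one inner envelope per query
response = {
    "images": {"status": "/mql/status/ok", "result": [{"id": "#0001"}]},
    "docs": {"status": "/mql/status/ok", "result": []},
}
results = results_by_name(response, ["images", "docs"])
```

Each inner envelope in the response carries its own status, so a fuller version would check every status before reading the corresponding result.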

Example 4.10. WhatsNew.html: fetching new images and documents from freebase.com

<head>
<script src="json.js"></script>
<script>
// These are a few important constants
var HOST = "http://www.freebase.com";
var READ = "/api/service/mqlread";
var RAW = "/api/trans/raw/";
var THUMB = "/api/trans/image_thumb/";
var BLURB = "/api/trans/blurb/";

/**
 * Send the queries named in the outer envelope object to Metaweb,
 * and pass the outer response envelope to the function f. This is a
 * variant of the Metaweb.read() function that runs multiple queries.
 */
function sendQueries(queryEnvelope, f) {
    // Define a unique function name
    var callbackName = "_" + sendQueries.counter++;

    // Create a function by that name, using sendQueries as a namespace.
    // This function expects to be passed the response to the query
    sendQueries[callbackName] = function(responseEnvelope) {
        document.body.removeChild(script);  // Remove <script> tag
        delete sendQueries[callbackName];   // Delete this function
        f(responseEnvelope);                // Pass response to user function
    };

    // Serialize and encode the query object
    var queries = encodeURIComponent(JSON.serialize(queryEnvelope));

    // Build the URL using encoded query text and the callback name
    var url = HOST + READ + "?queries=" + queries +
        "&callback=sendQueries." + callbackName;

    // Create a script tag, set its src attribute and add it to the document
    // This triggers the HTTP request and submits the query
    var script = document.createElement("script");
    script.src = url;
    document.body.appendChild(script);
}
sendQueries.counter = 0;  // Initialize the counter

// How many images and how many documents do we display?
var N = 10;                                          // This is the default
if (window.location.search.substring(0,3) == "?n=")  // URL argument overrides
    N = parseInt(window.location.search.substring(3), 10) || N;  // Keep default if not a number

// These are the queries we issue to find the n newest images and documents
var queries = {
    images: {
        query: [{
            type:"/common/image", id:null,        // Return image ids
            timestamp:null, sort:"-timestamp",    // Most recent first
            limit:N,                              // Only N of them
            "/type/content/media_type":null,      // Check image type, too
            "/type/content/media_type|=":[        // We only want images that are:
                "/media_type/image/gif",          // GIF or
                "/media_type/image/png",          // PNG or
                "/media_type/image/jpeg"          // JPEG
            ]
        }]
    },
    docs: {
        query: [{
            type:"/common/document", id:null,     // Return document ids
            timestamp:null, sort:"-timestamp",    // Most recent first
            limit:N                               // Only N of them
        }]
    }
};

// When the document has loaded, send the queries above to freebase.com.
// Then call the function below with the results
window.onload = function() { sendQueries(queries, displayResults) }

// This function gets called with our query results


function displayResults(response) {
// First, display image thumbnails
var images = response.images.result; // Array of images
var container=document.getElementById("newimages"); // Where they go
for(var i = 0; i < images.length; i++) { // Loop through them
var id = encodeURIComponent(images[i].id); // Image id in URL form

var thumbnail = document.createElement("img"); // Create <img> tag


thumbnail.src = HOST + THUMB + id; // url for image thumbnail
thumbnail.title = images[i].timestamp; // timestamp as tooltip

var link = document.createElement("a"); // Hyperlink for image


link.href = HOST + RAW + id; // to a full-size image
link.target = "_new"; // displayed in a new window

link.appendChild(thumbnail); // Put thumbnail inside link


container.appendChild(link); // Put link inside container



}

// Next display document blurbs


var docs = response.docs.result; // Array of documents
container = document.getElementById("newdocs"); // Where they go
for(var i = 0; i < docs.length; i++) { // Loop through them
var id = encodeURIComponent(docs[i].id); // Doc id in URL form
var blurb = document.createElement("iframe"); // Create an iframe
blurb.src = HOST + BLURB + id; // to hold doc blurb
var link = document.createElement("a"); // Hyperlink
link.href = HOST + RAW + id; // To full document
link.target = "_new"; // In a new window
link.innerHTML = docs[i].timestamp; // Timestamp as link text
var listitem = document.createElement("li"); // Create list item
listitem.appendChild(blurb); // Put blurb in item
listitem.appendChild(link); // Put link in item
container.appendChild(listitem); // Put item in container
}
}
</script>
<style> /* Make it all look good with a stylesheet */
img { margin: 5px;}
iframe { width: 70%; height: 75px; vertical-align: top;}
li a { vertical-align: bottom; }
h2 { margin-bottom: 5px; }
</style>
</head>
<body><!--Static document body. Thumbnails and blurbs dynamically inserted-->
<h2>The Newest Images</h2><i>Click thumbnail for full-size image</i>
<div id="newimages"><!-- thumbnails will go here --></div>
<h2>The Newest Documents</h2><i>Click timestamp for full document</i>
<ol id="newdocs"><!-- document blurbs go here --></ol>
</body>

4.9. Example: A Metaweb Type Browser


This chapter concludes with one final example[1]. Example 4.11 is a JavaScript-based web
application for browsing Metaweb types. Figure 4.2 shows a sample page that displays information
about the type /type/type. Clicking on the id of another type (or typing a type id in the upper
right) displays information about that type. You may actually find this type browser quite useful
for exploring Metaweb system types and the types in other domains. Remember, though, that
you must log on to www.freebase.com before using the example.

[1] If you want more, Section A.3 is a JavaScript-based example that demonstrates
Metaweb-powered autocompletion for HTML text fields.



Figure 4.2. A Metaweb type browser

This example is notable because it uses a more complicated query than the other queries in this
chapter. Example 4.11 uses the result data to generate a page of information about the specified
type. This example is also notable because its HTML output is more complex than previous ex-
amples. The code is well-commented, and if you've understood previous JavaScript examples,
you should not have trouble following this one.

Example 4.11. TypeBrowser.html: a Metaweb type browser

<html>
<head>
<!-- These are the modules we need -->
<script language="javascript" src="json.js"></script>
<script language="javascript" src="metaweb.js"></script>
<script language="javascript">
// This is the query we need to get information about a type.
// Note that we have to fill in the type we're interested in
// before sending this query.
var query = {
type:"/type/type", // The type of our type is /type/type :-)
id:null, // The type we're asking about. Filled in below.
name:null, // What is the human-readable type name?
// Objects with documentation are co-typed /freebase/documented_object
// Here we ask for a short description of the type
"/freebase/documented_object/tip":null,



properties:[{ // What properties does this type have?
optional:true,
name:null,
key:[],
expected_type: {name:null, id:null},
unique:null
}],
expected_by:[{ // What properties are of this type?
optional:true,
name:null,
key:[],
schema: {name:null, id:null}
}],
instance:[{ // What are some instances of this type?
optional:true,
id:null,
name:null
}]
};

// When we're first loaded, display /common/topic, or the type


// specified by the ?t= argument in the URL
window.onload = function() {
var type = "/common/topic";
var search = window.location.search;
if (search && search.indexOf("?t=") == 0)
type = decodeURIComponent(search.substring(3));
queryType(type);
}

// Query the specified type. Call displayType() when the results arrive
function queryType(type) {
query.id = type; // Specify the type in the query above
Metaweb.read(query, // Issue the query
displayType); // Pass result object to displayType
}

// Generate a page of information based on our query results.


function displayType(result) {
// DOMStream is a helper class defined below
// We use it here to output HTML text to the placeholder element
var out = new DOMStream("placeholder");
out.clear()

// If we didn't get any results then the input was invalid


if (!result) {
out.write("No such type");
out.flush();
return;
}



// Now begin generating information about the type
out.write("<h1>", result.id, "</h1>"); // Title
out.write("<h2>Name</h2>", result.name); // Section
out.write("<h2>Description</h2>"); // Another section
// Read the same property key that the query above asked for
var tip = result["/freebase/documented_object/tip"];
if (tip) out.write(tip);
else out.write("No description available");

// Display a table of properties


out.write("<h2>Properties</h2>")
if (result.properties.length == 0) out.write("No properties");
else {
out.write('<table border=1><tr>',
'<th>Property Name</th>',
'<th>Property Key</th>',
'<th>Property Type</th></tr>');

for(var i = 0; i < result.properties.length; i++) {


out.write('<tr><td>', result.properties[i].name,
'</td><td>', result.properties[i].key.join(", "),
'</td><td>');
if (result.properties[i].unique) out.write("unique ");
displayTypeLink(out, result.properties[i].expected_type.id,
result.properties[i].expected_type.name);
out.write('</td></tr>');
}
out.write("</table>");
}

// Display the properties of other types that use this type


out.write("<h2>Used by</h2>")
if (result.expected_by.length == 0)
out.write("There are no Properties of this type.");
else {
out.write('<table border=1><tr>',
'<th>Type</th>',
'<th>Property Key</th>',
'<th>Property Name</th>',
'</tr>');

for(var i = 0; i < result.expected_by.length; i++) {


out.write('<tr><td>');
displayTypeLink(out, result.expected_by[i].schema.id,
result.expected_by[i].schema.name);
out.write('</td><td>', result.expected_by[i].key.join(", "),
'</td><td>', result.expected_by[i].name,
'</td></tr>');
}
out.write("</table>");
}



// Output a list of the names of instances of this type
out.write("<h2>Instances</h2>");
if (result.instance.length == 0) out.write("No instances");
else {
for(var i = 0; i < result.instance.length; i++)
out.write(result.instance[i].name, ", ");
}

// Calling flush makes the output visible on the page


out.flush();
}

// Output a link to a type. Use the type id as the link text, and
// make the type name available as a tooltip
function displayTypeLink(out, id, name) {
out.write('<a title="', name, '" onclick="queryType(\'', id, '\')">',
id, '</a>');
}

// This little DOMStream class writes HTML into the element we specify
function DOMStream(id) { // Constructor function
this.elt = document.getElementById(id);
this.buffer = [];
}
DOMStream.prototype.clear = function() { // Erase element content
this.elt.innerHTML = "";
};
DOMStream.prototype.write = function() { // Buffer up all arguments
this.buffer.push.apply(this.buffer, arguments);
};
DOMStream.prototype.flush = function() { // Output all text to the element
this.elt.innerHTML += this.buffer.join("");
this.buffer.length = 0;
};
</script>

<style>
/* Some CSS styles to make everything look good */
body {
font-family: Arial, Helvetica, sans-serif; /* We like sans-serif */
margin-left: .5in; /* Indent everything... */
}
h1, h2 { margin-left: -.25in; } /* ...except headings */
h2 { margin-bottom: 5px; margin-top:10px; }
/* Make tables look nice */
table { border-collapse: collapse; width: 95%;}
th { background-color: #aaa;}
td { background-color: #ddd; padding: 1px 5px 1px 5px; }

/* Our <a> tags don't have hrefs, so we need to style them ourselves */



a { color: #00a; }
a:hover { text-decoration:underline; cursor:pointer;}

/* Make the input field look nice */


form.inputform {
float:right; border: solid black 2px; background-color: #aba;
margin: 15px 30px 0px 0px; padding: 10px;
}
</style>
</head>
<body>
<!-- A form in which the user can enter a type id -->
<form class='inputform' onsubmit="queryType(this.t.value); return false;">
Enter type id: <input name="t"></form>
<!-- Generated content goes here -->
<div id="placeholder"></div>
</body>
</html>



Chapter 5. The MQL Write Grammar
Insertions, deletions and updates to a Metaweb database are expressed in a variant of the Metaweb
Query Language documented in Chapter 3. The variant used for writing to Metaweb is known
as the MQL write grammar, and is the subject of this chapter. The chapter begins with a long
tutorial introduction to MQL writes, and then specifies the MQL write grammar more formally.
Write queries are submitted to Metaweb via the mqlwrite service, which is covered in Chapter 6.

5.1. MQL Write Tutorial


MQL writes are represented as JSON objects, just as MQL reads are. A number of features of
the MQL read grammar only make sense for reads and are not allowed in MQL writes. These
include the use of [] to query an array of values, and the use of the sort, limit and optional
directives. MQL write grammar supports two directives that are not allowed for reads. The create
directive is used to create a new object in the database, and the connect directive is used to create
a link between two objects. (As we'll see in the tutorial, however, the connect directive is
sometimes implicit and need not be specified explicitly).

Using this Tutorial

The best way to follow this tutorial is to try out the queries as you read about them. While
you learn how to make MQL writes, please use the freebase.com sandbox server at
http://sandbox.freebase.com. This server is intended for experimentation. The sandbox hosts
a replica of freebase.com, and this replica is re-created approximately once a week. This
means that any writes you perform (or mistakes you make!) on the sandbox will not persist
longer than a week.

Anyone can read data from Freebase, but before you can execute write queries, you must
register for an account and log in. If you already have an account at www.freebase.com, it
may already have been replicated on the sandbox server. If not, you can create a new account
for yourself on the sandbox. Follow the links from the http://sandbox.freebase.com/
homepage to register. (If you used an invitation code when you registered at
www.freebase.com, just reuse it if you need to register on sandbox.freebase.com.)

Once you are logged on to the sandbox, you can execute MQL write queries using the
Freebase query editor at http://sandbox.freebase.com/view/queryeditor/. Enter queries from
this tutorial, click the write button, and view the results. Just as with reads, you must place
your MQL write queries in an "envelope" object. So instead of entering queries as they are
written in this chapter, you must prefix them with {"query": and end them with an extra
closing }.
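
The envelope wrapping described above can be sketched in JavaScript. This is only an illustration: the function name wrapInEnvelope is hypothetical and not part of any Metaweb library, and the sketch uses the standard JSON.stringify where the examples in Chapter 4 use JSON.serialize from json.js.

```javascript
// Hypothetical helper (not part of metaweb.js): wrap a bare MQL write
// query in the {"query": ...} envelope that the query editor expects.
function wrapInEnvelope(query) {
    return { query: query };
}

// A sample write query of the kind used later in this tutorial
var write = {
    create: "unless_exists",
    type: "/user/docs/default_domain/note",   // substitute your own username
    name: "A",
    id: null
};

// The text to paste into the query editor
var envelopeText = JSON.stringify(wrapInEnvelope(write));
console.log(envelopeText);
```

The printed text begins with {"query": and ends with one extra closing brace, which is exactly the change the query editor requires.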

5.1.1. Creating a Type to Work With


Before we do any explicit MQL writes, let's begin by creating a simple type to work with. By
creating and using your own type, you guarantee that the writes you try while working through
this tutorial won't interact with writes being issued by other developers who may be working on
the tutorial at the same time. As you know, Metaweb types are defined by regular Metaweb objects
in the database. This means that types are created like any other objects, with MQL queries.
Defining a type with raw MQL is difficult and error-prone, however, so just about everyone defines
types using the freebase.com client.

The type we're creating will represent musical notes, and we'll call it "note". In order to create
it, follow the "My Freebase" link from the freebase.com home page. On the My Freebase page,
click on "Types Created", and enter the name "Note". Figure 5.1 illustrates.

Figure 5.1. Creating a new type on freebase.com

That's all you need to do for now. We'll add some properties to this type later, but now we just
need the type itself. If you click on the name of the newly created type (note that the freebase.com
client capitalizes the name for you) and look at the URL that it takes you to, you'll see that the
name of the new type is /user/username/default_domain/note, where "username" is the
username you logged in with. In the tutorial that follows, you'll see the username docs, but you
should substitute your own name throughout.



5.1.2. Creating Objects
Let's begin with a very simple write query:

Write                                         Result
{                                             {
  "create":"unless_exists",                     "create":"created",
  "type":"/user/docs/default_domain/note",      "type":"/user/docs/default_domain/note",
  "name":"A",                                   "name":"A",
  "id":null                                     "id":"#1f8000000000037ffc"
}                                             }

The first line of the query says that we want to create a new object, unless a matching object
already exists. The second line specifies the type of the object we're creating (remember to sub-
stitute your own user name for "docs" here). The third line specifies a value for the name property
of the new object. The fourth line of the write query is a request for the id of the newly created
object. Asking for an id is the only way you are allowed to use null in a write query. You may
not use null or [] for any other property.

Now let's look at the response to the write query. The first line is the create property, but its
value has changed from unless_exists to created. This tells us that the object we specified did
not already exist, and Metaweb has created it for us. The second and third lines simply repeat
the type and name properties that we passed in. They don't provide any new information, but
maintain the MQL invariant that responses have the same properties as queries. Finally, the fourth
line returns the id of the newly created object.

Metaweb IDs in this Tutorial

The guids used in this tutorial were created on the sandbox server, and are no longer valid,
so you should not try to query these objects directly. Instead, substitute your own user name
into the write queries, and create your own objects, with their own guids, as you follow
along with this tutorial. If you are reading a printed or PDF version of this chapter, note
that the guids have been shortened so that they fit more easily in two-column format.

Now let's see what happens if we run exactly the same query again:

Write                                         Result
{                                             {
  "create":"unless_exists",                     "create":"existed",
  "type":"/user/docs/default_domain/note",      "type":"/user/docs/default_domain/note",
  "name":"A",                                   "name":"A",
  "id":null                                     "id":"#1f8000000000037ffc"
}                                             }



We're asking that an object be created unless it already exists, and this time it does already exist.
So Metaweb returns existed as the value of the create property, along with the id of the
already existing object. Note that this id is the same as the one we've already seen.

Now let's force Metaweb to create another new test object for us:

Write                                         Result
{                                             {
  "create":"unconditional",                     "create":"created",
  "type":"/user/docs/default_domain/note",      "type":"/user/docs/default_domain/note",
  "name":"A",                                   "name":"A",
  "id":null                                     "id":"#1f800000000003800f"
}                                             }

In this query, we've changed the value of the create directive to unconditional. As its name
implies, this value tells Metaweb to create a new object no matter what. Since a new object is
created unconditionally, the value of the create property in the response will always be created.
You can see that a new object was created by comparing the id returned by this query to those
returned by the previous two queries.

When to use create:unconditional

It is almost never necessary or correct to use "create":"unconditional" in a MQL write
query. Most writes use "create":"unless_exists", and many others use
"create":"unless_connected", which has not been introduced yet. Using "unconditional" leads to
duplicate objects, and as we'll see below, this can get you into trouble!

We now have two note objects with the name "A". What happens if we run the original
unless_exists write again?

{
"status": "400 Bad request",
"messages": [
{
"query": {
"create": "unless_exists",
"type": "/user/docs/default_domain/note",
"name": "A",
"error_inside": ".",
"id": null
},
"text": "Need a unique result to attach here, not 2",
"args": {
"count": 2,
"guids": [
"#1f8000000000037ffc",
"#1f800000000003800f"



]
},
"type": "/service/error/result",
"level": "error"
}
],
"result": {
"q": null
}
}

The query fails this time, and returns the JSON object shown above. The
"create":"unless_exists" directive works only if there are 0 or 1 instances of the object. If there
is no object that matches, it creates one. If there is one object that matches, it returns it. But if
there is more than one, it has no way to choose which one to return, and fails with an error
message. Note that the query fails even if we omit "id":null. The lesson here is that if you plan to
use unless_exists, you should use it consistently so you never end up with more than one instance
of an object.
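
The behavior of the create directive traced in this section can be summarized with a toy in-memory model. This is a sketch under stated assumptions, not a description of how Metaweb is implemented: the store array, the create function and the "#toy" ids are all hypothetical, and only the unless_exists and unconditional values introduced so far are modeled.

```javascript
// A toy in-memory model of the "create" directive. This is NOT how
// Metaweb is implemented; it only mirrors the behavior described above.
var store = [];          // each entry: {id, type, name}
var nextId = 1;          // hypothetical id counter (real ids are guids)

function create(query) {
    // Find existing objects that match the type and name constraints
    var matches = [];
    for (var i = 0; i < store.length; i++) {
        if (store[i].type === query.type && store[i].name === query.name)
            matches.push(store[i]);
    }

    if (query.create === "unless_exists") {
        if (matches.length > 1)      // Ambiguous: no way to choose one
            throw new Error("Need a unique result to attach here, not " +
                            matches.length);
        if (matches.length === 1)    // Exactly one match: return it
            return { create: "existed", type: query.type,
                     name: query.name, id: matches[0].id };
    }
    else if (query.create !== "unconditional") {
        throw new Error("Unsupported create value: " + query.create);
    }

    // No match (or unconditional): create a new object
    var obj = { id: "#toy" + nextId++, type: query.type, name: query.name };
    store.push(obj);
    return { create: "created", type: obj.type, name: obj.name, id: obj.id };
}

// Replay the four writes of this section
var noteType = "/user/docs/default_domain/note";
var first  = create({ create: "unless_exists", type: noteType, name: "A" });
var second = create({ create: "unless_exists", type: noteType, name: "A" });
var third  = create({ create: "unconditional", type: noteType, name: "A" });
var failure = null;
try { create({ create: "unless_exists", type: noteType, name: "A" }); }
catch (e) { failure = e.message; }
console.log(first.create, second.create, third.create, failure);
```

In this model the first write reports created, the second reports existed with the same id, the unconditional write creates a duplicate, and the final unless_exists write fails because two objects now match.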

5.1.3. Connecting Objects


So far we've created two distinct objects with identical types and names. Let's now rename one
so we can tell them apart by name. Recall that an object is named by linking it to a primitive
value of /type/text. We want to update the name link to refer to a different value:

Write                                         Result
{                                             {
  "id":"#1f800000000003800f",                   "id":"#1f800000000003800f",
  "name":{                                      "name":{
    "connect":"update",                           "connect":"updated",
    "value":"B",                                  "value":"B",
    "lang":"/lang/en"                             "lang":"/lang/en"
  }                                             }
}                                             }

The first line of the query identifies, by id, the object we want to modify. The second and third
lines specify that we want to update the name property of that object so that it refers to the
/type/text value specified by the fourth and fifth lines. (Recall that /type/text is a primitive value
that consists of a string of text and a language identifier for that text. MQL write queries require
you to specify both the value and lang properties when manipulating a name.)

The response looks just like the query except that the value of the connect property has changed
to updated. This tells us that the update we requested has been performed.

What happens if we run exactly the same write query again?



Write                                         Result
{                                             {
  "id":"#1f800000000003800f",                   "id":"#1f800000000003800f",
  "name":{                                      "name":{
    "connect":"update",                           "connect":"present",
    "value":"B",                                  "value":"B",
    "lang":"/lang/en"                             "lang":"/lang/en"
  }                                             }
}                                             }

We're asking to make a change that has already been made, and Metaweb lets us know this by
setting the connect property of the response to present.

We now have two newly-created objects with the same type and different names. We changed
the name of the second object by updating a /type/text value. /type/text is a primitive type
in Metaweb, so this isn't quite the same thing as a link between two different objects in the data-
base. Now, let's modify the first object (the note A) so that it is a /common/topic in addition to
being a note:

Write                                         Result
{                                             {
  "id":"#1f8000000000037ffc",                   "id":"#1f8000000000037ffc",
  "type":{                                      "type":{
    "connect":"insert",                           "connect":"inserted",
    "id":"/common/topic"                          "id":"/common/topic"
  }                                             }
}                                             }

The first line of the query specifies the object to be modified. (Normally, we'd identify the object
by name and type, but we can't specify the type and add a type in the same query. The name "A"
is probably not unique by itself, so we specify the object we want to modify by id.) The second
and third lines specify that we want to insert a new connection between this object and another
object, and that this new connection should use the type property. The fourth line specifies, by
id, the object that is being connected to.

Note that the value of the connect directive is insert instead of update, which is what we used
above. The difference between the two is simple. Use "connect":"update" for properties that
have a unique value (and for the name property, which is unique on a per-language basis). Use
"connect":"insert" for properties, such as type, that can have more than one value. You are
also allowed to use "connect":"insert" with unique properties if there is not already a value
for that property.

The response object sets the value of the connect directive to inserted, telling us that the insertion
was successful. Our note named "A" is now also a /common/topic. If you visit your "My Freebase"
page, the note object should now be visible under the "Topics Created" heading. This is the main
reason to use the /common/topic type on the objects you create: it allows them to work well with
the freebase.com client.



What happens if we run the same query again?

Write                                         Result
{                                             {
  "id":"#1f8000000000037ffc",                   "id":"#1f8000000000037ffc",
  "type":{                                      "type":{
    "connect":"insert",                           "connect":"present",
    "id":"/common/topic"                          "id":"/common/topic"
  }                                             }
}                                             }

We're asking to insert /common/topic into a set of types that already includes /common/topic,
and we get the response present. It tells us that this value is already in the set and that nothing
has changed. (Non-unique properties in Metaweb are like sets: they do not allow duplicates.)
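
The difference between update and insert, and the set-like behavior of non-unique properties, can be sketched with a toy model. Everything here is hypothetical: the uniqueProps table stands in for the schema's "unique" flag, the per-language uniqueness of names is ignored, and the result reported when update is applied to a property that has no value yet is an assumption, since the tutorial does not show that case.

```javascript
// A toy model of "connect":"insert" and "connect":"update". It mirrors
// only the behavior described in this tutorial, not Metaweb's internals.
// uniqueProps is a hypothetical stand-in for the schema's uniqueness flag.
var uniqueProps = { name: true, type: false };

function connect(obj, prop, directive, value) {
    if (uniqueProps[prop]) {
        var old = obj[prop];                       // undefined if unset
        if (old === value) return "present";       // Nothing to change
        if (directive === "insert" && old !== undefined)
            throw new Error("unique property already has a value");
        obj[prop] = value;
        // Assumption: the tutorial does not show an update on an empty
        // property; this sketch reports that case as an insertion.
        return old === undefined ? "inserted" : "updated";
    }

    // Non-unique properties behave like sets: no duplicates allowed.
    // Assumption: this sketch rejects "update" on a non-unique property.
    if (directive !== "insert")
        throw new Error("use insert for non-unique properties");
    var values = obj[prop] || (obj[prop] = []);
    for (var i = 0; i < values.length; i++)
        if (values[i] === value) return "present";
    values.push(value);
    return "inserted";
}

// Replay the writes of this section against a toy note object
var toyNote = { name: "A", type: ["/user/docs/default_domain/note"] };
var r1 = connect(toyNote, "name", "update", "B");
var r2 = connect(toyNote, "name", "update", "B");
var r3 = connect(toyNote, "type", "insert", "/common/topic");
var r4 = connect(toyNote, "type", "insert", "/common/topic");
console.log(r1, r2, r3, r4);   // updated present inserted present
```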

Let's do a quick read query to confirm that our object is a member of two types:

Read                                          Result
{                                             {
  "id":"#1f8000000000037ffc",                   "id":"#1f8000000000037ffc",
  "type":[]                                     "type":[
}                                                 "/user/docs/default_domain/note",
                                                  "/common/topic"
                                                ]
                                              }

So we see that our object is, in fact, a note and a topic.



5.1.4. Disconnecting Objects
We've seen that Metaweb allows us to connect objects with "connect":"insert" or
"connect":"update". To disconnect objects, use "connect":"delete". Let's alter the object that
represents the note A again, to remove /common/topic from its set of types:

Write                                         Result
{                                             {
  "id":"#1f8000000000037ffc",                   "id":"#1f8000000000037ffc",
  "type":{                                      "type":{
    "connect":"delete",                           "connect":"deleted",
    "id":"/common/topic"                          "id":"/common/topic"
  }                                             }
}                                             }

This query looks just like the query we used to add the type, except that we've changed "insert"
to "delete". And Metaweb's response looks just like the response to the insertion, except that
"inserted" has changed to "deleted". You can verify that the object is no longer a /common/topic
by visiting "My Freebase" on sandbox.freebase.com and noting that it no longer appears in the
"Topics Created" list.

At this point, you probably have a pretty good idea what will happen if we re-run the query:

Write                                         Result
{                                             {
  "id":"#1f8000000000037ffc",                   "id":"#1f8000000000037ffc",
  "type":{                                      "type":{
    "connect":"delete",                           "connect":"absent",
    "id":"/common/topic"                          "id":"/common/topic"
  }                                             }
}                                             }

We asked Metaweb to remove /common/topic from a set that did not contain /common/topic,
so it returned "absent" to indicate that nothing has been changed.
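
The deleted and absent results can be illustrated with the same kind of toy model. This is a sketch only: the disconnect function is hypothetical and models a set-valued property as a plain JavaScript array.

```javascript
// Toy model of "connect":"delete" on a set-valued property such as type.
// Not Metaweb's implementation; it only mirrors the results shown above.
function disconnect(obj, prop, value) {
    var values = obj[prop] || [];
    for (var i = 0; i < values.length; i++) {
        if (values[i] === value) {
            values.splice(i, 1);     // Remove the link
            return "deleted";
        }
    }
    return "absent";                 // Nothing to remove
}

var toyNoteA = { type: ["/user/docs/default_domain/note", "/common/topic"] };
var firstTry  = disconnect(toyNoteA, "type", "/common/topic");
var secondTry = disconnect(toyNoteA, "type", "/common/topic");
console.log(firstTry, secondTry, toyNoteA.type);
```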

The MQL write grammar has no syntax for deleting objects themselves. The closest thing to de-
leting an object is to delete all connections from that object to others. If an object has no type,
no name, and no other properties of interest, then it becomes effectively unreachable, and is almost
as good as gone. Note, however, that Metaweb maintains a modification history for each object.
When you view an object in the freebase.com client, you'll see a "History" link at the bottom of
each page. Clicking this link allows you to view the change history for the object, and allows
you to undo changes, including deletions.

When an object has had all its links deleted, it can still be queried by guid or creator. (Metaweb
does not allow these read-only properties to be deleted.) In practice, however, unreachable objects
will only be found by determined searchers, and their continued existence is very unlikely to affect
the results of future queries. Unreachable objects may at some point be purged from a Metaweb
database, but their guids will never be reused.

Let's use this unlinking technique to "delete" the two note objects we've created:

Write                                         Result
[{                                            [{
  "id":"#1f8000000000037ffc",                   "id":"#1f8000000000037ffc",
  "type":{                                      "type":{
    "connect":"delete",                           "connect":"deleted",
    "id":"/user/docs/default_domain/note"         "id":"/user/docs/default_domain/note"
  },                                            },
  "name":{                                      "name":{
    "connect":"delete",                           "connect":"deleted",
    "value":"A",                                  "value":"A",
    "lang":"/lang/en"                             "lang":"/lang/en"
  }                                             }
},{                                           },{
  "id":"#1f800000000003800f",                   "id":"#1f800000000003800f",
  "type":{                                      "type":{
    "connect":"delete",                           "connect":"deleted",
    "id":"/user/docs/default_domain/note"         "id":"/user/docs/default_domain/note"
  },                                            },
  "name":{                                      "name":{
    "connect":"delete",                           "connect":"deleted",
    "value":"B",                                  "value":"B",
    "lang":"/lang/en"                             "lang":"/lang/en"
  }                                             }
}]                                            }]

Note that this write query is really two separate queries, included within square brackets. The
mqlwrite service (the topic of Chapter 6) accepts submissions of multiple writes at once. Note
that names are deleted with "connect":"delete", even though they are unique and were originally
created with "connect":"update". You must specify the lang property explicitly when deleting
a name.
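
If you need to unlink several objects, the query above can be generated rather than typed. The following is a minimal sketch that follows the pattern of the query shown in this section; the helper name buildUnlinkQuery is hypothetical, and the sketch assumes each object has exactly one type and one name to remove.

```javascript
// Hypothetical helper: build the write query that "deletes" an object by
// unlinking its type and its name, following the pattern shown above.
function buildUnlinkQuery(id, typeId, name, lang) {
    return {
        id: id,
        type: { connect: "delete", id: typeId },
        name: {
            connect: "delete",
            value: name,
            lang: lang || "/lang/en"   // lang must be given explicitly
        }
    };
}

// Several independent writes can be submitted together in an array
var noteTypeId = "/user/docs/default_domain/note";
var unlinkBoth = [
    buildUnlinkQuery("#1f8000000000037ffc", noteTypeId, "A"),
    buildUnlinkQuery("#1f800000000003800f", noteTypeId, "B")
];
console.log(JSON.stringify(unlinkBoth, null, 2));
```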

As a final test, let's query the first of these objects (by id) and find out what little information it
still carries:

Read                                          Result
{                                             {
  "id":"#1f8000000000037ffc",                   "id":"#1f8000000000037ffc",
  "*":null                                      "guid":"#1f8000000000037ffc",
}                                               "name":null,
                                                "type":[],
                                                "key":[],
                                                "creator":"/user/docs",
                                                "permission":"/boot/all_permission",
                                                "timestamp":"2006-11-08T20:00:02.0000Z"
                                              }



As expected, the name and types of the object are gone. All that remains are its id, creator, creation
timestamp, and permissions.

Example 6.7 in Chapter 6 is a command-line script for unlinking Metaweb objects in this way.
Its default behavior is to delete all objects you have created. You may find this script useful to
wipe your slate clean while experimenting with MQL writes.

Multiple Queries and Atomicity

When you submit multiple top-level write queries to Metaweb at the same time, it is natural
to ask whether they are executed in order, and whether a query can depend on an object
created by a previous query. The answer to both questions is no. The reason is a good one,
however: when multiple queries are submitted at the same time, they are executed atomically:
all are executed or none are executed.

In order to implement this atomic behavior, the Metaweb server first tests each query to
determine whether it will succeed. It does this without actually executing the query. If all
queries pass the test, then all are executed. Note, however, that this means that each query
must be able to succeed before any other queries have been run. Therefore, the queries
must be completely independent of each other. And since they are independent, there is
really no way to tell what order Metaweb executes them in.
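
The test-then-execute behavior described in this sidebar can be sketched as a two-phase loop. This is a toy model, not Metaweb's implementation: the state object, the "create" and "rename" operations, and the canApply and applyWrite functions are all hypothetical stand-ins chosen to keep the example short.

```javascript
// Toy model of atomic submission: test every write against the current
// state first, and apply them only if all of the tests pass.
// canApply and applyWrite are hypothetical stand-ins for server logic.
function canApply(state, write) {
    if (write.op === "create") return !(write.key in state);
    if (write.op === "rename") return write.key in state;
    return false;
}

function applyWrite(state, write) {
    state[write.key] = write.name;
}

function submitAtomically(state, writes) {
    // Phase 1: each write is tested against the ORIGINAL state
    for (var i = 0; i < writes.length; i++)
        if (!canApply(state, writes[i])) return false;   // None are executed
    // Phase 2: all writes passed the test, so execute them all
    for (var j = 0; j < writes.length; j++) applyWrite(state, writes[j]);
    return true;
}

var state = {};

// The second write depends on the object created by the first, so the
// batch as a whole is rejected and the state is left unchanged.
var dependent = submitAtomically(state, [
    { op: "create", key: "n1", name: "A" },
    { op: "rename", key: "n1", name: "B" }
]);

// Two independent writes succeed together.
var independent = submitAtomically(state, [
    { op: "create", key: "n1", name: "A" },
    { op: "create", key: "n2", name: "B" }
]);
console.log(dependent, independent, state);
```

Because every write is tested before any write runs, the rename in the first batch cannot see the object that the create would have produced, which is why dependent batches fail.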

5.1.5. Writes and Default Properties


Take a look again at the MQL write queries we use to create and "delete" Note objects. First, the
creation:

Write                                         Result
{                                             {
  "create":"unless_exists",                     "create":"created",
  "type":"/user/docs/default_domain/note",      "type":"/user/docs/default_domain/note",
  "name":"C#",                                  "name":"C#",
  "id":null                                     "id":"#1f800000000104befe"
}                                             }

Now contrast this with the query that "deletes" the object by unlinking its type and name:

Write                                         Result
{                                             {
  "id":"#1f800000000104befe",                   "id":"#1f800000000104befe",
  "type":{                                      "type":{
    "connect":"delete",                           "connect":"deleted",
    "id":"/user/docs/default_domain/note"         "id":"/user/docs/default_domain/note"
  },                                            },
  "name":{                                      "name":{
    "connect":"delete",                           "connect":"deleted",
    "value":"C#",                                 "value":"C#",
    "lang":"/lang/en"                             "lang":"/lang/en"
  }                                             }
}                                             }

The creation query is much more compact because we are able to specify the type as a single id
and the name as a single string. In the deletion query, we must specify the expanded objects.
There are three factors that interact to make the creation query shorter. First, recall from Chapter 3
that every type has a default property. For value types such as /type/text (the type of the name
property) the default property is value. For core types in the /type domain, the default property
is id. For all other types, the default property is name. So in the creation query,
"type":"/user/docs/default_domain/note" is shorthand (but see the caution below!) for:

"type": { "id":"/user/docs/default_domain/note" }

The second factor that makes the creation query so compact is the fact that when you specify a
default property rather than a full object in a MQL write query, Metaweb assumes an implicit
"connect":"insert". So writing "type":"/user/docs/default_domain/note" is kind of (but
not exactly: see the caution that follows) like writing:

"type": {
"connect":"insert",
"id":"/user/docs/default_domain/note"
}

The third factor that makes the creation query compact is that the language of /type/text values
is automatically set to the default of English, or to your preferred language as specified by a
parameter to the mqlwrite service. (See Chapter 6 for details.)

All three factors come into play when we write "name":"C#". "C#" becomes the value of the
default property, which is the value. An implicit "connect":"insert" is added. And a lang
property is added to specify /lang/en, or whatever language we are using. So "name":"C#" ex-
pands to (but see the caution!):

"name": {
"connect":"insert",
"value":"C#",
"lang":"/lang/en"
}
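
The three factors can be put together in a short sketch. The helper expandShorthand and the DEFAULT_PROPERTY table are hypothetical and cover only the name and type properties used in this tutorial; this is an illustration of the expansion, not Metaweb's actual logic, and the caution about unless_exists raised in the text still applies to the expanded forms.

```javascript
// Hypothetical helper that spells out the shorthand described above for
// the two properties used in this tutorial. It is a sketch of the three
// factors, not Metaweb's actual expansion logic.
var DEFAULT_PROPERTY = {
    name: "value",   // /type/text is a value type: default property is value
    type: "id"       // /type/type is in the /type domain: default property is id
};

function expandShorthand(property, shorthand, lang) {
    var key = DEFAULT_PROPERTY[property];
    if (!key) throw new Error("no default known for " + property);
    var expanded = { connect: "insert" };      // Factor 2: implicit insert
    expanded[key] = shorthand;                 // Factor 1: default property
    if (key === "value")
        expanded.lang = lang || "/lang/en";    // Factor 3: default language
    return expanded;
}

var expandedName = expandShorthand("name", "C#");
var expandedType = expandShorthand("type", "/user/docs/default_domain/note");
console.log(JSON.stringify(expandedName));
console.log(JSON.stringify(expandedType));
```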



Caution: unless_exists with Expanded Objects

From the explanation above, you might assume that the compact creation query with which
we began this section could be equivalently (but less compactly) written as:

{
"create":"unless_exists",
"id":null,
"name": "C#",
"type": {
"connect":"insert",
"id":"/user/docs/default_domain/note"
}
}

If the queries used "create":"unconditional" then they would be the same. But the
meaning of unless_exists is different for the two queries. The original compact query
could be translated as If you can find a Note object named "C#", return its id. Otherwise,
create a new Note object, name it "C#", and return its id.

But this variant that expands the type property is different in a subtle but important way.
It tells Metaweb: find or create an object named "C#", and then add Note to its set of types.
The difference between the two queries is critical if there is already an object (of type
/programming/language, perhaps) with the name "C#".

Here's another way to think about this. When the type is specified by id, this is a constraint
on the query. Metaweb must find an object that matches, or must construct one. When the
type is specified in a sub-query with an explicit connect directive, the sub-query is not a
constraint, and does not affect the results of the unless_exists search.
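
To make the distinction concrete, here is a toy Python model of the unless_exists search. It is a sketch under stated assumptions, not a description of how Metaweb works internally: the store is a plain list of dictionaries, the function create_unless_exists is hypothetical, and the name is always treated as a constraint.

```python
def create_unless_exists(store, name, constraint_types=(), connect_types=()):
    """Toy model of "create":"unless_exists".

    constraint_types take part in the search (type given by id);
    connect_types are attached afterwards and never affect the search
    (type given as a sub-query with an explicit connect directive).
    """
    for obj in store:
        if obj["name"] == name and set(constraint_types) <= obj["types"]:
            obj["types"].update(connect_types)
            return "existed", obj
    obj = {"name": name, "types": set(constraint_types) | set(connect_types)}
    store.append(obj)
    return "created", obj


NOTE = "/user/docs/default_domain/note"
LANGUAGE = "/programming/language"

# Compact query: the type is a constraint, so the language object does not match.
store_a = [{"name": "C#", "types": {LANGUAGE}}]
status_a, note_a = create_unless_exists(store_a, "C#", constraint_types=[NOTE])

# Expanded query: only the name constrains the search, so the language object
# matches and Note is added to its set of types.
store_b = [{"name": "C#", "types": {LANGUAGE}}]
status_b, obj_b = create_unless_exists(store_b, "C#", connect_types=[NOTE])
```

In the first case the store ends up with two objects named "C#"; in the second, the existing programming language object has quietly become a Note as well.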

5.1.6. Creating and Connecting More Objects


Let's try some more advanced examples. Before we start, though, we need to add a property to
our Note type. Here's how you do it (Figure 5.2 illustrates):

• Visit your "My Freebase" page on sandbox.freebase.com and click on the Note type under
Types Created.

• Click on the "Edit Type" link.

• Click on the Add a New Property button, enter the property name "next" into the text field
that appears, and click the Save button.

• Freebase now allows you to enter details about the property:

• Enter the type name "Note" into the Expected Type field. (You may see a drop-down list
containing your version of the Note type and many other developers' versions. Select the
one that is followed by your username in parentheses.)

• Click the Restrict to one value checkbox to indicate that the property may have only a
single value.

Figure 5.2. Adding a property on freebase.com

The next property we've just added to our Note type allows us to link one note to another in a
chain or a ring. We'll use this property to link each note to its perfect fifth--the note that is 7
semitones higher. (Usually, this is 5 white keys on a piano keyboard, which is probably why it is
called a fifth.) If we start with the note C, we find that its fifth is the note G. Before we start using
the next property to represent fifths, however, let's run a simple query that will give us a convenient
shortcut:

Write Result
{ {
"id":"/user/docs/default_domain/note", "id":"/user/docs/default_domain/note",
"key":{ "key":{
"connect":"insert", "namespace":"/user/docs",
"namespace":"/user/docs", "connect":"inserted",
"value":"note" "value":"note"
} }
} }

This query specifies our Note type object by id, and then adds a new /type/key value to its key
property. What we've done is to make /user/docs/note a synonym for
/user/docs/default_domain/note. You may find this a helpful shortcut as you type in the example queries that follow.
We'll explore namespaces again later in this tutorial.

Now, let's create Note objects to represent the notes C and G. Note that the following query is
two independent queries in an array:

Write Result
[{ [{
"create":"unless_exists", "create":"created",
"id":null, "id":"#1f80000000000384b0",
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"C" "name":"C"
},{ },{
"create":"unless_exists", "create":"created",
"id":null, "id":"#1f80000000000384b4",
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"G" "name":"G"
}] }]

We've asked Metaweb to create two Note objects, with names C and G, and to return their ids to
us. Now, let's insert the link that indicates that G is the fifth of C:

Write Result
{ {
"id":"#1f80000000000384b0", "id":"#1f80000000000384b0",
"/user/docs/note/next":{ "/user/docs/note/next":{
"connect":"update", "connect":"inserted",
"id":"#1f80000000000384b4" "id":"#1f80000000000384b4"
} }
} }

This compact query identifies both note objects by id and connects them with a connect directive.
Since we defined the next property to be unique, it uses "connect":"update" instead of
"connect":"insert". Note that since this query never specifies the type of the objects, we must use
a fully-qualified property name for the next property. You can verify that this query did what
we intended using the freebase.com client. Visit My Freebase on sandbox.freebase.com, and
click on the Note type. On the page for the Note type, you should see a list of instances of that
type. Click on the one named "C", and you'll see that it includes a hyperlink to the note G labeled
"Next".

The linking technique shown above is straightforward and easy to understand. It uses one query
to create (or look up) the two objects to be linked. Then it uses a second simple query to connect
the two objects. It is usually possible, however, to combine the creation and linking into a single
query. The following query, for example, sets the next property of the note G to a newly-created
note named D:

Write Result
{ {
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"G", "name":"G",
"next":{ "next":{
"create":"unless_exists", "create":"created",
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"D" "name":"D"
} }
} }

Notice that there is no connect directive here. Since the create directive is nested in this query,
the connection is implicit.

Here's a longer query of the same sort:

Write Result
{ {
"create":"unless_exists", "create":"created",
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"B flat", "name":"B flat",
"next":{ "next":{
"create":"unless_exists", "create":"created",
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"F", "name":"F",
"next":{ "next":{
"create":"unless_exists", "create":"connected",
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"C" "name":"C"
} }
} }
} }

This query creates a note F and links it to the existing note C, and then creates a note B flat and
links it to the new note F. Note that the query uses "create":"unless_exists" three times. The
response includes "created" twice for the newly created notes. But for the note C, which already
exists, the response says "create":"connected". This tells us that the note C already existed,
but that a new connection has been made to it. If we rerun the query, we get "create":"existed"
all three times, since the objects and links already exist.

The following query is like the one above, but shorter, and with one important tweak:

Write Result
{ {
"create":"unless_exists", "create":"created",
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"E flat", "name":"E flat",
"next":{ "next":{
"create":"unless_connected", "create":"created",
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"B flat" "name":"B flat"
} }
} }

This query creates a new note E flat, and connects it to B flat. Notice, however, that in the nested
clause of the query, we used a different form of the create directive: "create":"unless_connected".
And in the response we have a "create":"created". If you examine the list of Note
instances in the freebase.com client, you'll see that there are now two of them named "B flat". If
you use unless_connected, then Metaweb looks for a matching object that is already connected.
If it cannot find one, it creates a new one and connects it. In this case, there was an existing Note
object named B flat, but it was not already connected, so the query created a new one. If we
re-run the query, however, it simply returns "create":"existed" because the object and the
connection exist.

Note that unless_connected only makes sense in nested clauses. If we change the outermost
unless_exists in the query above to unless_connected, Metaweb complains: Can't use 'create':
'unless_connected' at the root of the query.
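
The difference between the two nested forms can be modeled with a small Python sketch. This is a toy model built only from the behavior described above, not Metaweb's algorithm: objects are represented by their names in a list, links are (parent, index) pairs, and the function nested_create and the parent "D flat" are hypothetical.

```python
def nested_create(objects, links, parent, name, directive):
    """Toy model of a nested create clause for a single property.

    objects: list of object names (duplicates allowed, identified by position)
    links:   set of (parent, child_index) pairs
    Returns the response string that the text describes.
    """
    connected = [i for (p, i) in links if p == parent and objects[i] == name]
    if connected:
        return "existed"
    if directive == "unless_exists":
        # Any matching object will do, connected or not.
        for i, existing in enumerate(objects):
            if existing == name:
                links.add((parent, i))
                return "connected"
    elif directive != "unless_connected":
        raise ValueError("unsupported directive: " + directive)
    # No usable match: create a new object and connect it.
    objects.append(name)
    links.add((parent, len(objects) - 1))
    return "created"


objects = ["B flat"]   # a B flat exists, but is not connected to E flat
links = set()
first = nested_create(objects, links, "E flat", "B flat", "unless_connected")
second = nested_create(objects, links, "E flat", "B flat", "unless_connected")
other = nested_create(objects, links, "D flat", "B flat", "unless_exists")
```

The first call creates a duplicate B flat, the second finds the connection already in place, and the unless_exists call from a different parent connects to an existing object instead of creating a third.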

When to use unless_connected

The "create":"unless_connected" directive is used relatively infrequently. Use it when objects
must be unique within their "parent". One example of this is in the /music domain, where
/music/track objects are unique for each /music/album. When a band releases an album
with a hit song on it, that song is likely to end up being released again on compilation
albums, live albums, cover albums, and so on. In the freebase.com /music domain, however,
each song object is associated with only one album. With a schema like this,
unless_connected is more useful than unless_exists.

Let's clean up the extra B flat object we created:

Write Result
{ {
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"E flat", "name":"E flat",
"next":{ "next":{
"connect":"delete", "connect":"deleted",
"type":{ "type":{
"connect":"delete", "connect":"deleted",
"id":"/user/docs/note" "id":"/user/docs/note"
}, },
"name":{ "name":{
"connect":"delete", "connect":"deleted",
"value":"B flat", "value":"B flat",
"lang":"/lang/en" "lang":"/lang/en"
} }
} }
} }

Note that the query above does two things. It disconnects the name and type of the extra B flat
object, and also disconnects that object from E flat. Now all we have to do is connect E flat to
the valid B flat object. This should be easy for you now:

Write Result
{ {
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"E flat", "name":"E flat",
"next":{ "next":{
"connect":"insert", "type":"/user/docs/note",
"type":"/user/docs/note", "connect":"inserted",
"name":"B flat" "name":"B flat"
} }
} }

5.1.7. Review: Write Directives


At this stage of the tutorial, you've seen all the variations of the create and connect directives.
Let's do a quick review before diving into some more advanced examples.

The create directive comes in three forms:

"create":"unconditional" Always create the specified object. It is almost never necessary or
appropriate to use this form of the create directive.

"create":"unless_exists" Look for the object in the database and create a new one
if a match cannot be found.

"create":"unless_connected" Look for a matching object that already exists and is already
connected to the parent query. If no such object exists, create and connect a new one.

The possible responses to a create directive are the following:

"create":"created" Indicates that a new object has been created. This is always the
response for unconditional directives, but may also be returned
by unless_exists and unless_connected directives.

"create":"existed" Indicates that a pre-existing match was found and no object was
created. This may be returned by unless_exists or unless_connected directives.

"create":"connected" Indicates that the object already existed but a connection has
been made. This response is only possible for unless_exists
directives that are nested within a parent query.

The three forms of the connect directive are:

"connect":"insert" Use this form to attach a value or object to a non-unique property.
It can also be used to attach the first value or object to a unique property.

"connect":"update" Use this form to attach a value or object to a unique property,
replacing any value or object that was previously connected.

"connect":"delete" Use this form to detach a value or object from a property. It works
for unique and non-unique properties.

There are five possible responses to a connect directive:

"connect":"inserted" Indicates that an insert directive was successful.

"connect":"updated" Indicates that an update directive was successful.

"connect":"deleted" Indicates that a delete directive was successful.

"connect":"present" Indicates that an insert or update directive was unsuccessful
because the specified connection was already present.

"connect":"absent" Indicates that a delete directive was not successful because the
connection to be deleted did not exist.
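
If you process mqlwrite responses in a program, it can help to keep this review in a lookup table. The Python sketch below simply transcribes the lists above; the names CREATE_RESPONSES, CONNECT_RESPONSES, and is_expected_response are hypothetical. One entry goes beyond the review: "inserted" is listed as a possible response to an update directive, because the example earlier in this section shows "connect":"update" returning "connect":"inserted" when the unique property had no previous value.

```python
# Possible responses for each directive, transcribed from the review above.
CREATE_RESPONSES = {
    "unconditional": {"created"},
    "unless_exists": {"created", "existed", "connected"},
    "unless_connected": {"created", "existed"},
}
CONNECT_RESPONSES = {
    "insert": {"inserted", "present"},
    # "inserted" is included because the earlier next-property example
    # returned it for an update on a property that had no value yet.
    "update": {"updated", "inserted", "present"},
    "delete": {"deleted", "absent"},
}


def is_expected_response(keyword, directive, response):
    """Return True if `response` is one the review lists for the directive."""
    if keyword == "create":
        table = CREATE_RESPONSES
    elif keyword == "connect":
        table = CONNECT_RESPONSES
    else:
        raise ValueError("unknown keyword: " + keyword)
    return response in table.get(directive, set())
```

A check such as is_expected_response("create", "unless_connected", "connected") returns False, which matches the rule that "connected" is only returned for nested unless_exists directives.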

5.1.8. Working with Sets


The most interesting examples we've explored so far have used the next property of our Note
type. We defined this property to be unique--so that it can have only one value. There are some
features of the MQL write grammar that only become apparent when used on non-unique
properties, however. Let's define a Chord type and give it a non-unique property named note which
links to Note objects. (By convention, we use a singular property name, even though we expect
each Chord object to refer to multiple Note objects.) Create this type and its property on
sandbox.freebase.com by repeating the steps we followed to define the Note type and its next
property. Just change the names to Chord and note, and don't check the "Restrict to one value"
box.

If you appreciated not having to type default_domain/ in the examples above, you can use the
same shortcut for the new Chord type:

{
"id":"/user/docs/default_domain/chord",
"key":{
"connect":"insert",
"namespace":"/user/docs",
"value":"chord"
}
}

Now let's define a chord using the notes C, E, and G.

Write Result
{ {
"create":"unless_exists", "create":"created",
"name":"CEG", "name":"CEG",
"type":[ "type":[
"/common/topic", "/common/topic",
"/user/docs/chord" "/user/docs/chord"
], ],
"note":[{ "note":[{
"connect":"insert", "connect":"inserted",
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"C" "name":"C"
},{ },{
"connect":"insert", "connect":"inserted",
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"G" "name":"G"
},{ },{
"create":"unless_exists", "create":"created",
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"E" "name":"E"
}] }]
} }

Several things immediately stand out about this query:

• It specifies the ids of two types within a JSON array. The created object will be both a Chord
and a Topic. (We'll say more about arrays in write queries and about the /common/topic type
below).

• It specifies three notes, as expanded objects, within a JSON array. These are the set of values
for the note property of the chord.

• Note objects C and G exist already, so this query uses "connect":"insert" for these two. We
haven't created an object to represent E yet, so the query creates and connects it with
"create":"unless_exists".

So far in this chapter, we've only seen square brackets in write queries when we were bundling
up multiple top-level queries to be submitted to Metaweb in a single batch. The MQL write
grammar is actually more general than this: nested queries can also be collected into an array,
and this allows us to connect more than one value to a property. In the case of the type property,
our query specifies two types by their id. As we discussed earlier, types can be specified by id
because id is the default property of /type/type. When types are specified this way,
"connect":"insert" is assumed. The reason that we specify /common/topic in addition to the Chord
type is that the freebase.com client uses topics as its organizing metaphor. Objects of type
/common/topic simply work better in the client. For example, /common/topic objects you create are
listed under the heading "Topics Created" on your My Freebase page.

Multiple Types and Unqualified Property Names

When we specify more than one type for an object, we use a JSON array. But the Metaweb
object model represents the types as an unordered set, so the order in which we specify
them should not matter. In fact, however, it does. The last type in the array of types is used
to qualify any unqualified property names that are not /type/object properties.

In the query above, if we had specified /user/docs/chord first, and /common/topic second,
then Metaweb would have assumed that the unqualified note property meant
/common/topic/note, and this would have caused an error since there is no such property. If you don't
want to rely on the order of the types, you can just be explicit and use the fully-qualified
names of all properties, such as /user/docs/chord/note.
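
The qualification rule can be expressed as a short Python function. This is a sketch of the rule as stated above, not Metaweb's code: the function name qualify is hypothetical, and the set of /type/object property names is a partial list assumed for illustration.

```python
# Assumption: a partial list of /type/object properties, for illustration only.
OBJECT_PROPERTIES = {"id", "guid", "name", "type", "key",
                     "timestamp", "creator", "permission"}


def qualify(prop, types):
    """Return the fully-qualified name for a property used in a query
    whose type array is `types`."""
    if prop.startswith("/"):
        return prop                       # already fully qualified
    if prop in OBJECT_PROPERTIES:
        return "/type/object/" + prop     # common to every object
    return types[-1] + "/" + prop         # the last type in the array wins


good = qualify("note", ["/common/topic", "/user/docs/chord"])
bad = qualify("note", ["/user/docs/chord", "/common/topic"])
```

With the types in the order used by our query, note resolves to /user/docs/chord/note. With the order reversed, it resolves to /common/topic/note, which does not exist.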

5.1.9. Bidirectional Links and Reciprocal Properties


One of the fundamental aspects of Metaweb is that all links between nodes are bi-directional.
Our CEG Chord node has links to the nodes that represent the notes C, E, and G. Those links are
bi-directional, which means that the C, E, and G nodes are linked to the CEG Chord node. The
links are there, but our Note type doesn't define an appropriate property that exposes those links
in the object-oriented view of the database.

Fortunately, the freebase.com client makes it very easy to define such a property:

• Go to your My Freebase page on sandbox.freebase.com, and click on your Note type under
User's Types.

• Click on the "View Schema" link on the page for your Note type.

• Look near the bottom of the schema page (you may have to scroll down) for the heading
Suggested Properties. You should see something like what is shown in Figure 5.3.

• This tells you that the type Chord has a property named Note. 1 The client is suggesting that
you add a reciprocal property to expose the other direction of the link. The link is already there:
all that is required is that you give this property a name so that you can refer to it.

• Since the Chord property that refers to Notes is named note, it seems sensible to name the
Note property that refers to Chords chord. Click the Edit button or double-click the "double-click
to edit" text message. Then type in "chord" and hit Enter or click Save.

1
The freebase.com client capitalizes type and property names. These are the "human-readable" forms; the ids are still lowercase
(/user/docs/chord and /user/docs/chord/note).

Figure 5.3. Adding a reciprocal property

You have now created the property /user/docs/note/chord, which is the reciprocal property
of /user/docs/chord/note. Since we now have a pair of properties, we can take advantage of
the bi-directional nature of the links between chords and notes.

Let's experiment with this. First, we'll query the Chord CEG to find out what notes it contains:

Read Result
{ {
"type":"/user/docs/chord", "type":"/user/docs/chord",
"name":"CEG", "name":"CEG",
"note":[] "note":["C","G","E"]
} }

This result is unsurprising, given that the /user/docs/chord/note property is the one we defined
originally. Now let's turn the query around and try out the reciprocal /user/docs/note/chord
property we've just added. What chords is the note C a part of?

Read Result
{ {
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"C", "name":"C",
"chord":[] "chord":["CEG"]
} }

The note C "knows" that it is part of the chord CEG even though we never set its chord property.
Setting a property automatically causes its reciprocal property to be set as well. Because links
are bi-directional in Metaweb, this is all automatic.

Now let's create a new chord:

Write Result
{ {
"create":"unless_exists", "create":"created",
"type":["/common/topic", "type":["/common/topic",
"/user/docs/chord"], "/user/docs/chord"],
"name":"BFG" "name":"BFG"
} }

We've created a chord named BFG, but we haven't added the notes B, F and G to it. To further
demonstrate reciprocal properties, we'll do the reverse, and add the chord to the notes:

Write Result
[{ [{
"create":"unless_exists", "create":"created",
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"B", "name":"B",
"chord": { "chord":{
"connect":"insert", "connect":"inserted",
"type":"/user/docs/chord", "type":"/user/docs/chord",
"name":"BFG" "name":"BFG"
} }
},{ },{
"create":"unless_exists", "create":"existed",
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"F", "name":"F",
"chord": { "chord":{
"connect":"insert", "connect":"inserted",
"type":"/user/docs/chord", "type":"/user/docs/chord",
"name":"BFG" "name":"BFG"
} }
},{ },{
"create":"unless_exists", "create":"existed",
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"G", "name":"G",
"chord": { "chord":{
"connect":"insert", "connect":"inserted",
"type":"/user/docs/chord", "type":"/user/docs/chord",
"name":"BFG" "name":"BFG"
} }
}] }]

This query connects the BFG chord to the chord property of the notes B, F, and G. (It also creates
the note B, which didn't exist yet.) Now let's ask BFG what notes it contains:

Read Result
{ {
"type":"/user/docs/chord", "type":"/user/docs/chord",
"name":"BFG", "name":"BFG",
"note":[] "note":["B","F","G"]
} }

Once again, we've demonstrated that we can set a property of an object by setting the reciprocal
property to refer to that object.
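
One way to picture why this works is to think of each connection as a single stored link that can be read from either end. The Python class below is a toy model of that idea only; it says nothing about how Metaweb stores links, and the class and method names are hypothetical.

```python
class LinkStore:
    """Toy model: one stored link answers queries from both directions."""

    def __init__(self):
        self.links = set()               # (chord_name, note_name) pairs

    def connect(self, chord, note):
        # It does not matter which side the write query started from:
        # the same single link is recorded.
        self.links.add((chord, note))

    def notes_of(self, chord):           # plays the role of /user/docs/chord/note
        return sorted(n for c, n in self.links if c == chord)

    def chords_of(self, note):           # plays the role of /user/docs/note/chord
        return sorted(c for c, n in self.links if n == note)


store = LinkStore()
for note in ["C", "E", "G"]:
    store.connect("CEG", note)           # written from the chord side
for note in ["B", "F", "G"]:
    store.connect("BFG", note)           # written from the note side
```

Reading store.notes_of("BFG") and store.chords_of("G") both draw on the same set of links, which is the sense in which the two properties are reciprocal.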

5.1.9.1. How Reciprocal Properties Work
Now let's explore the properties /user/docs/chord/note and /user/docs/note/chord to find
out what makes them the reciprocal of each other. Recall that properties are themselves Metaweb
objects. First we'll query the note property of Chord:

Read                            Result
{                               {
"type":"/type/property",        "type":"/type/property",
"id":"/user/docs/chord/note",   "id":"/user/docs/chord/note",
"expected_type":null,           "expected_type":"/user/docs/note",
"master_property":null,         "master_property":null,
"reverse_property":null         "reverse_property":
}                                 "/user/docs/default_domain/note/chord"
                                }

We find that the note property of Chord has an expected_type of Note, as expected. More inter-
estingly, though, we find a property named reverse_property that refers to the chord property
of the Note type. 2 So let's query that property now:

Read                            Result
{                               {
"type":"/type/property",        "type":"/type/property",
"id":"/user/docs/note/chord",   "id":"/user/docs/note/chord",
"expected_type":null,           "expected_type":"/user/docs/chord",
"master_property":null,         "master_property":
"reverse_property":null           "/user/docs/default_domain/chord/note",
}                               "reverse_property":null
                                }

This property has a reverse_property of null, but has a property named master_property that
refers back to the first property we looked at.

Reciprocal properties are linked to each other via the master_property and reverse_property
properties. When one property is set, its reciprocal, if it has one, is automatically set. The
reciprocity is symmetrical: the terms "master" and "reverse" imply a directionality or hierarchy, but the
property labeled "master" has no special status or preference over the property labeled "reverse".
(When types are created with the freebase.com client, the property created first is the master
property.) 3

2
Note that default_domain has crept back into the result. We've created shortcuts for our types in easier-to-type namespaces, but
their original location was under /user/docs/default_domain.
3
The master_property and reverse_property properties are themselves reciprocal properties. (Verifying this with a MQL
query is left as an exercise for the reader.) This means that if you set p.reverse_property to q then q.master_property is
automatically set to p.

5.1.10. Writes and Ordered Collections
If a Metaweb property has not been declared a unique property, it may have a set of values. As
we saw in Chapter 3, these sets may be ordered, and MQL read queries can access this order
with the index keyword. This section shows how to define an ordering with a MQL write query.
Not surprisingly, this also uses the index keyword.

In order to demonstrate how to create an ordered collection, we'll need a suitable type. Chords
don't work: the notes of a chord are played simultaneously, and no order is required. A broken
chord (or arpeggio) is a chord in which the notes are played sequentially. Since there is a sequence,
there is an order. Use the freebase.com client to define a new type named "Arpeggio" in the
sandbox. Give it a property named "note" whose expected type is Note. Arpeggio is actually just
like Chord: only the names are different. To save yourself typing, copy the type from
/user/docs/default_domain/arpeggio (use your own username) to /user/docs/arpeggio, just
as we did for the Note and Chord types:

{
"id":"/user/docs/default_domain/arpeggio",
"key":{
"connect":"insert",
"namespace":"/user/docs",
"value":"arpeggio"
}
}

Now with our type defined, let's create our first ordered collection:

Write Result
{ {
"create":"unless_exists", "create":"created",
"type":"/user/docs/arpeggio", "type":"/user/docs/arpeggio",
"name":"broken CEG", "name":"broken CEG",
"note": [{ "note":[{
"index":0, "index":0,
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"C" "name":"C"
},{ },{
"index":1, "index":1,
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"E" "name":"E"
},{ },{
"index":2, "index":2,
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"G" "name":"G"
}] }]
} }

Creating an arpeggio is much like creating a chord. Two things stand out about this query, however.
First, each note has an index associated with it. Second, there are no connect directives. The index
property specifies the ordering, and does an implicit "connect":"insert". (If the object was
already inserted, then the index would simply re-order it without attempting to re-insert it.) If we
were creating the Note objects at the same time as we were inserting them into this Arpeggio
object, we would have to include both the create directive and the index property.

There are some strict rules that govern the use of the index property:

• The index property may not appear within a top-level query. Indexes don't apply to objects
but to the links between objects. The index property is used in sub-queries to specify the order
of the links between the parent object and the children.

• If there are n sibling sub-queries that specify an index, the values specified must include every
integer from 0 to n-1. You must always start with zero. You may not include duplicate indexes,
and you may not skip an index. It is not required that every element of a sub-query array have
an index. Metaweb collections can be partially ordered and partially unordered.
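
The second rule is easy to check mechanically before submitting a query. The following Python function is a minimal sketch of such a check; the name check_sibling_indexes is hypothetical, and it assumes that every index value present is an integer.

```python
def check_sibling_indexes(subqueries):
    """Return True if the index values in a list of sibling sub-queries are
    exactly the integers 0..n-1, where n is the number of siblings that
    carry an index. Siblings without an index are allowed and ignored.
    """
    indexes = [q["index"] for q in subqueries if "index" in q]
    return sorted(indexes) == list(range(len(indexes)))


complete = [{"index": 0, "name": "C"}, {"index": 1, "name": "E"}, {"index": 2, "name": "G"}]
no_zero = [{"index": 1, "name": "C"}, {"index": 2, "name": "E"}]
duplicate = [{"index": 0, "name": "C"}, {"index": 0, "name": "E"}]
skipped = [{"index": 0, "name": "C"}, {"index": 2, "name": "E"}]
partial = [{"index": 0, "name": "C"}, {"name": "E"}, {"index": 1, "name": "G"}]
```

Only the first and last of these lists pass the check: the others start above zero, repeat an index, or skip one.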

This second rule may seem surprisingly strict, but remember that despite the name "index", the
values we specify with the index property are not array indexes. The numbers are merely a simple
way to specify a series of less than and greater than relationships. The requirement that indexes
always run from 0 through n-1 means that there is really no way to insert an element at a given
location within an ordered collection, and no way to move an element from one spot to another.

Suppose, for example, that we want to insert the notes B and F into our CEG arpeggio at the
beginning so the arpeggio consists of the five sequential notes BFCEG:

Write Result
{ {
"type":"/user/docs/arpeggio", "type":"/user/docs/arpeggio",
"name":"broken CEG", "name":"broken CEG",
"note": [{ "note":[{
"create":"unless_exists", "create":"connected",
"index":0, "index":0,
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"B" "name":"B"
},{ },{
"create":"unless_exists", "create":"connected",
"index":1, "index":1,
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"F" "name":"F"
}] }]
} }

This query demonstrates that the index property can be used along with a create directive. The
notes already exist, but are not connected to the arpeggio, so we get "create":"connected" in the response.

What has this query actually done? We can use a read query to ask for the notes of the arpeggio
in order and see if we've accomplished what we wanted to. But before we do, let's consider what
information we've given Metaweb. Previously, we told Metaweb that C comes before E and E
comes before G. Now, we've also told Metaweb that B comes before F. But this isn't enough.
Metaweb does not know anything about the relationship between BF and CEG. There are many
possible orderings that meet the criteria we've specified. BFCEG is one possible ordering, but so
is CEGBF, and so is CBEFG!
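
We can convince ourselves of this with a few lines of Python. The sketch below treats each write query as a chain of "comes before" facts and tests candidate orderings against them; the function name respects is hypothetical, and FBCEG is included only as a counterexample.

```python
def respects(order, chains):
    """True if every chain appears in `order` in the same relative order."""
    position = {name: i for i, name in enumerate(order)}
    return all(
        position[a] < position[b]
        for chain in chains
        for a, b in zip(chain, chain[1:])
    )


# The only facts Metaweb was given: C before E before G, and B before F.
chains = [["C", "E", "G"], ["B", "F"]]
candidates = ["BFCEG", "CEGBF", "CBEFG", "FBCEG"]
consistent = [c for c in candidates if respects(c, chains)]
```

Three of the four candidates satisfy everything we told Metaweb. Only FBCEG is ruled out, because it puts F before B.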

Let's see what we actually get:

Read                            Result
{                               {
"type":"/user/docs/arpeggio",   "type":"/user/docs/arpeggio",
"name":"broken CEG",            "name":"broken CEG",
"note":[{                       "note":[
"index":null,                   {"index":0, "name":"B"},
"name":null,                    {"index":1, "name":"F"},
"sort":"index"                  {"index":2, "name":"C"},
}]                              {"index":3, "name":"E"},
}                               {"index":4, "name":"G"}
                                ]
                                }

Metaweb actually returns the notes in the order we wanted! The fact that Metaweb inserts new
elements at the beginning of an ordered collection is an implementation detail, however, and is
not behavior that is guaranteed. In practice, if you want to define a particular ordering for the
elements of a collection, you must write a single query that enumerates each of those elements and
gives them all an index. Any subsequent insertions or shuffles require you to submit a new query
that again lists all elements and defines their order. 4

Here, therefore, is how we should have written the query above:

{
"type":"/user/docs/arpeggio",
"name":"broken CEG",
"note": [{
"create":"unless_exists",
"index":0,
"type":"/user/docs/note",
"name":"B"
},{
"create":"unless_exists",
"index":1,
"type":"/user/docs/note",
"name":"F"
},
{ "index":2, "type":"/user/docs/note", "name":"C" },
{ "index":3, "type":"/user/docs/note", "name":"E" },
{ "index":4, "type":"/user/docs/note", "name":"G" }]
}

This query inserts the two new notes and re-iterates the order of the three existing ones, to specify
a complete ordering of five notes with indexes 0 through 4.

4
What if, when we'd inserted B and F, we'd also included C, with index 2, in the query? Then Metaweb would know that B is less than
F and F is less than C. And it already knows that C is less than E and E is less than G. Wouldn't that define a complete ordering? This
sounds plausible, but, in fact, after such a query Metaweb would have an ordering for the subset BFC and another ordering for the
subset EG, and it still would not know the relationship between those two subsets.
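
Since every change to an ordered collection means restating the whole order, it is natural to generate the sub-query array in code. Here is a minimal Python sketch; the helper name ordered_notes and its new_names parameter are hypothetical, and it assumes that notes not listed in new_names are already connected to the arpeggio.

```python
def ordered_notes(names, new_names=()):
    """Build the sub-query array for a complete ordering of `names`.

    Notes listed in new_names get "create":"unless_exists"; the others are
    assumed to be connected already and only have their index restated.
    """
    clauses = []
    for index, name in enumerate(names):
        clause = {"index": index, "type": "/user/docs/note", "name": name}
        if name in new_names:
            clause["create"] = "unless_exists"
        clauses.append(clause)
    return clauses


query = {
    "type": "/user/docs/arpeggio",
    "name": "broken CEG",
    "note": ordered_notes(["B", "F", "C", "E", "G"], new_names={"B", "F"}),
}
```

Serializing query with a JSON library yields a write query with the same content as the one shown above. Because indexes are assigned with enumerate, they always run from 0 through n-1 with no gaps or duplicates.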

Metaweb's ordered collections are not arrays and do not behave like arrays. The requirement that
you re-specify the complete ordering even for simple insertions demonstrates this. And it also
makes it clear that ordering is only practical for relatively small and static collections. It makes
good sense to define an ordering for the tracks on an album, for example. It makes less sense to
define an ordering for the albums recorded by a band, however, because this set of albums may
change with time and a database maintainer would be required to respecify the complete
discography each time a new album was added. And it makes no sense at all to try to specify an ordering
for bands (by specifying an index for each instance property of the /music/artist type): there
are simply too many of them.

Arpeggios and Duplicate Notes

If you are a musician, you probably know that broken chords often repeat a note. In practice,
we'd want to represent arpeggios like EGCE, where the note E appears twice. In Metaweb,
however, the value of a property is a set. Ordered collections are still sets, not lists, and they
do not allow duplicates. In order to represent an arpeggio EGCE, therefore, we'd have to create
two separate note objects named E. But having two objects that both represent the note E is
problematic: the next and chord properties of the Note type are premised on the assumption
that there will only be one Note instance for each note.

There are two lessons to be learned here:

• Ordered collections are still sets, and do not allow duplicates.

• Designing good Metaweb schemas for knowledge representation is hard to do.

5.1.11. Namespaces
In several places throughout this tutorial we've placed types into new namespaces simply to make
our queries a little easier to enter into the query editor. In this section we'll explore namespaces
in more detail.

We begin with a review of material from Chapter 2. First, remember that fully-qualified names
and namespaces don't have anything to do with the name property of an object. The name property
defines a human-readable display name for an object. Fully-qualified names are unique and can
be used as an alternative to the object guid.

Fully-qualified names are defined by the value type /type/key. Every object has a key property
that holds a set of /type/key values. If you want an object to have a fully-qualified name, insert
a key into its key property. The value property of the key specifies the object's unqualified or
local name. And the namespace property of the key specifies the object that defines the namespace.


Any object can be a namespace: the only requirement is that the object must itself have a key. In
this way we get a chain of /type/key/value properties that continues until we find a
/type/key/namespace property that refers to the special root namespace object.
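The chain of keys described above can be modeled in a few lines of Python. This is only an illustrative in-memory sketch, not Metaweb code: the Obj class, the ROOT object, and the add_key() and fully_qualified_names() helpers are hypothetical names invented here. It is written for Python 3 and assumes that namespaces never form a cycle.

```python
# Toy model of keys and namespaces. All names here are hypothetical and
# exist only for illustration; none of them is part of a Metaweb API.

class Obj(object):
    def __init__(self, label):
        self.label = label  # Display label, playing the role of the name property
        self.keys = []      # (namespace object, local name) pairs

ROOT = Obj('root')  # Stands in for the special root namespace object

def add_key(obj, namespace, value):
    # Model inserting a /type/key value into the key property of obj.
    obj.keys.append((namespace, value))

def fully_qualified_names(obj):
    # Follow every key up through its namespace until the root is reached.
    if obj is ROOT:
        return ['']
    names = []
    for namespace, value in obj.keys:
        for prefix in fully_qualified_names(namespace):
            names.append(prefix + '/' + value)
    return names

user, docs, note, c = Obj('user'), Obj('docs'), Obj('Note'), Obj('C')
add_key(user, ROOT, 'user')
add_key(docs, user, 'docs')
add_key(note, docs, 'note')
add_key(c, note, 'C')
print(fully_qualified_names(c))  # ['/user/docs/note/C']
```

An object with no key gets an empty list from this function, which mirrors the rule that an object cannot serve as a namespace unless it has a key of its own.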

The type /type/namespace exists, and defines the property /type/namespace/keys, which
is the reciprocal of /type/key/namespace. Objects that are used as namespaces are usually given
the type /type/namespace, but this is not required.

Namespaces are useful because they allow us to use fully-qualified names to uniquely identify
objects. If an object is given a key, then we can use its unique fully-qualified name as the value
of the id property. Identifying objects with a human-readable id is simpler than using a long
guid, and is more reliable than using the name and type properties together.

With that review of namespaces, let's try to put some of the note objects we've created into a
namespace. We'll use the /user/docs/note type object as our namespace:

Write Result
[{ [{
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"C", "name":"C",
"key":{ "key":{
"connect":"insert", "connect":"inserted",
"namespace":"/user/docs/note", "namespace":"/user/docs/note",
"value":"C" "value":"C"
} }
},{ },{
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"E", "name":"E",
"key":{ "key":{
"connect":"insert", "connect":"inserted",
"namespace":"/user/docs/note", "namespace":"/user/docs/note",
"value":"E" "value":"E"
} }
},{ },{
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"G", "name":"G",
"key":{ "key":{
"connect":"insert", "connect":"inserted",
"namespace":"/user/docs/note", "namespace":"/user/docs/note",
"value":"G" "value":"G"
} }
}] }]

This query gives the notes C, E, and G keys named "C", "E", and "G" within the namespace
/user/docs/note. That is, it defines fully-qualified names for these notes: /user/docs/note/C,
/user/docs/note/E, and /user/docs/note/G. Now that these notes have unique ids, it becomes
(somewhat) easier to use them in queries. Here's how we might create a chord:


Write Result
{ {
"create":"unless_exists", "create":"existed",
"type":"/user/docs/chord", "type":"/user/docs/chord",
"name":"CEG", "name":"CEG",
"note":[{ "note":[{
"connect":"insert", "connect":"unchanged",
"id":"/user/docs/note/C" "id":"/user/docs/note/C"
},{ },{
"connect":"insert", "connect":"unchanged",
"id":"/user/docs/note/E" "id":"/user/docs/note/E"
},{ },{
"connect":"insert", "connect":"unchanged",
"id":"/user/docs/note/G" "id":"/user/docs/note/G"
}] }]
} }

This query replaces the name and type properties of each note with a single id property. It doesn't
actually do anything, since we have already created the CEG chord. We've seen that we can use
a note's fully-qualified name as the value of its id property. What if we query the id of a note?

Read                                    Result
{                                       {
  "type":"/user/docs/chord",              "type":"/user/docs/chord",
  "name":"CEG",                           "name":"CEG",
  "note":[{"id":null}]                    "note":[{"id":"#1f800000000104bd02"},
}                                                 {"id":"#1f800000000104beae"},
                                                  {"id":"#1f800000000104beec"}]
                                        }

We get the guids of the notes rather than the fully-qualified names we've just defined. By
contrast, objects of core types such as /type/type and /type/property, which use id as their
default property, return a fully-qualified name instead of a guid in queries like this.
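The key-insertion queries shown earlier in this section all have the same shape, so they are easy to generate. The sketch below builds them as Python dictionaries and prints the JSON. The key_insert_query() helper is a hypothetical name introduced here, and the code is written for Python 3 with the standard json module rather than the simplejson module used elsewhere in this manual.

```python
import json

def key_insert_query(type_id, name, namespace, value):
    # Build a write query that finds an object by type and name and gives it
    # the local name `value` in the namespace `namespace`.
    return {
        'type': type_id,
        'name': name,
        'key': {
            'connect': 'insert',
            'namespace': namespace,
            'value': value,
        },
    }

# One query per note, using the note type itself as the namespace.
queries = [key_insert_query('/user/docs/note', n, '/user/docs/note', n)
           for n in ('C', 'E', 'G')]
print(json.dumps(queries, indent=2))
```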

5.1.11.1. The /type/namespace/keys Property


We've seen that we can put objects into a namespace by setting the key property of the object. It
is also possible to work with namespaces using the reciprocal property /type/namespace/keys.
We've been using /user/docs/note as a namespace. This next query asks what keys it holds:

Read                                    Result
{                                       {
  "id":"/user/docs/note",                 "id":"/user/docs/note",
  "/type/namespace/keys":[]               "/type/namespace/keys":[
}                                           "next","chord","arpeggio","C","E","G"
                                          ]
                                        }


The namespace holds the names of the three notes we added, and also the names of the three
properties we defined for the type. Let's repeat the query and ask for more detail:

Read                                    Result
{                                       {
  "id":"/user/docs/note",                 "id":"/user/docs/note",
  "/type/namespace/keys":[{}]             "/type/namespace/keys":[{
}                                           "type":"/type/key",
                                            "namespace":
                                              "/user/docs/default_domain/note/next",
                                            "value":"next"
                                          },{
                                            "type":"/type/key",
                                            "namespace":
                                              "/user/docs/default_domain/note/chord",
                                            "value":"chord"
                                          },{
                                            "type":"/type/key",
                                            "namespace":
                                              "/user/docs/default_domain/note/arpeggio",
                                            "value":"arpeggio"
                                          },{
                                            "type":"/type/key",
                                            "namespace":
                                              "/user/docs/default_domain/note/C",
                                            "value":"C"
                                          },{
                                            "type":"/type/key",
                                            "namespace":
                                              "/user/docs/default_domain/note/E",
                                            "value":"E"
                                          },{
                                            "type":"/type/key",
                                            "namespace":
                                              "/user/docs/default_domain/note/G",
                                            "value":"G"
                                          }]
                                        }

The values of the /type/namespace/keys property are /type/key values that have value and
namespace properties. You'll notice that default_domain has crept back into the object ids in the
query results. This is interesting, but not terribly important. We'll investigate it later in this section.

There is one very important point to notice about these query results. When a key value is used
with /type/object/key, the namespace property is the id of the namespace object (such as
/user/docs/note) that holds the key. But when a key value is used with /type/namespace/keys,
the namespace property is the id of the object (such as /user/docs/note/C) contained by the
namespace. This is important to understand, so we'll state it another way: suppose that an object
o has a fully-qualified name in the namespace n. If we query the key property of o, we'll find a
/type/key object whose namespace property refers to n. And if we query the
/type/namespace/keys property of n, we'll find a /type/key object whose namespace property
refers to o.

If you wanted to create a Metaweb namespace browser application, you could repeat the query
above, starting with the id of the root namespace "/". The namespace properties of each of the
returned keys specify the ids of all objects in the root namespace. If you recursively query each
of these ids, you'll find the complete set of Metaweb objects with fully-qualified names.
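The recursive walk described above might be sketched as follows. The list_namespace() function is a hypothetical helper; it takes any function that accepts a MQL read query as a dictionary and returns the result as a dictionary. So that the sketch runs without a network connection, it is exercised against a small fake read function with made-up contents. It is written for Python 3, and it assumes that a query for an object with no keys returns an empty list or null.

```python
def list_namespace(read, namespace_id, seen=None):
    # Return the ids of all objects reachable from namespace_id by
    # following /type/namespace/keys recursively.
    if seen is None:
        seen = set()
    seen.add(namespace_id)
    result = read({'id': namespace_id, '/type/namespace/keys': [{}]}) or {}
    found = []
    for key in result.get('/type/namespace/keys') or []:
        child_id = key['namespace']   # The id of the contained object
        if child_id not in seen:      # Guard against visiting an id twice
            found.append(child_id)
            found.extend(list_namespace(read, child_id, seen))
    return found

# A stand-in for a real read service, holding a tiny made-up hierarchy.
FAKE_CONTENTS = {
    '/': ['/user'],
    '/user': ['/user/docs'],
    '/user/docs': ['/user/docs/note'],
    '/user/docs/note': ['/user/docs/note/C', '/user/docs/note/G'],
}

def fake_read(query):
    children = FAKE_CONTENTS.get(query['id'], [])
    keys = [{'type': '/type/key', 'namespace': c, 'value': c.rsplit('/', 1)[1]}
            for c in children]
    return {'id': query['id'], '/type/namespace/keys': keys}

for object_id in list_namespace(fake_read, '/'):
    print(object_id)
```

A real browser would probably fetch one level at a time on demand, since the full set of named objects may be very large.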

It is also possible to add objects to namespaces using the /type/namespace/keys property instead
of /type/object/key. The following query creates a new Note object named "G flat" and assigns
it the fully-qualified name /user/docs/note/G_flat:

Write                                   Result
{                                       {
  "id":"/user/docs/note",                 "id":"/user/docs/note",
  "/type/namespace/keys":{                "/type/namespace/keys":{
    "connect":"insert",                     "connect":"connected",
    "value":"G_flat",                       "value":"G_flat",
    "namespace":{                           "namespace":{
      "create":"unless_exists",               "create":"created",
      "name":"G flat",                        "name":"G flat",
      "type":"/user/docs/note"                "type":"/user/docs/note"
    }                                       }
  }                                       }
}                                       }

5.1.11.2. Fully-Qualified Names and Uniqueness


In this section we explore the uniqueness of fully-qualified names. First, recall that earlier in the
tutorial we defined shortcut names for types. We've been using the name /user/docs/note for
a type that was originally defined as /user/docs/default_domain/note. If the type has two
fully-qualified names, and we're using that type as a namespace, then each of the notes we inserted
into that namespace should also have two names. /user/docs/default_domain/note/G should
be the same thing as /user/docs/note/G:

Read Result
{ {
"id":"/user/docs/default_domain/note/G", "id":"/user/docs/default_domain/note/G",
"/user/docs/note/chord":[] "/user/docs/note/chord":["CEG","BFG"]
} }

So a single object can have more than one fully-qualified name. But can a fully-qualified name
refer to more than one object? Let's try to give the note F the same key that we assigned to G:

{
  "type":"/user/docs/note",
  "name":"F",
  "key":{
    "connect":"insert",
    "namespace":"/user/docs/note",
    "value":"G"
  }
}

This query fails, although there is nothing obviously wrong with it. Metaweb simply will not allow
the fully-qualified name /user/docs/note/G to refer to two different note objects. If you want
to make /user/docs/note/G refer to the note F, you must first make sure that the name no longer
refers to the note G. This takes two queries. First, we must remove the fully-qualified name from
the note G:

Write Result
{ {
"id":"/user/docs/note/G", "id":"/user/docs/note/G",
"key":{ "key":{
"connect":"delete", "connect":"deleted",
"namespace":"/user/docs/note", "namespace":"/user/docs/note",
"value":"G" "value":"G"
} }
} }

And then we can assign that fully-qualified name to the note F:

Write Result
{ {
"type":"/user/docs/note", "type":"/user/docs/note",
"name":"F", "name":"F",
"key":{ "key":{
"connect":"insert", "connect":"inserted",
"namespace":"/user/docs/note", "namespace":"/user/docs/note",
"value":"G" "value":"G"
} }
} }

Now if we were to ask for the name of the note /user/docs/note/G, we'd get "F". Making a
fully-qualified name refer to another object is simpler if we use the /type/namespace/keys
property instead. Here's how we could make /user/docs/note/G refer to the note G again:

Write                                   Result
{                                       {
  "id":"/user/docs/note",                 "id":"/user/docs/note",
  "/type/namespace/keys":{                "/type/namespace/keys":{
    "value":"G",                            "value":"G",
    "namespace":{                           "namespace":{
      "connect":"update",                     "connect":"updated",
      "type":"/user/docs/note",               "type":"/user/docs/note",
      "name":"G"                              "name":"G"
    }                                       }
  }                                       }
}                                       }

This query locates the /type/key object that defines the name /user/docs/note/G, and updates
the namespace property of that key, so that the name points to a different object. Note that you
should not typically have to alter namespaces like this. Objects that have fully-qualified names
should typically be constants.
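The uniqueness rule explored in this section can be summarized with a toy model: within one namespace, each local name refers to exactly one object. The Namespace and Note classes below are hypothetical illustrations written for Python 3. They imitate the behavior described above and say nothing about how Metaweb implements it.

```python
class Note(object):
    def __init__(self, name):
        self.name = name

class Namespace(object):
    def __init__(self):
        self.keys = {}  # Local name -> the one object it refers to

    def insert(self, value, obj):
        # Like "connect":"insert": refuse a name that already refers
        # to a different object.
        if value in self.keys and self.keys[value] is not obj:
            raise ValueError('%s already refers to another object' % value)
        self.keys[value] = obj

    def delete(self, value):
        # Like "connect":"delete": the name no longer refers to anything.
        del self.keys[value]

    def update(self, value, obj):
        # Like "connect":"update": make the name refer to obj, replacing
        # whatever it referred to before.
        self.keys[value] = obj

notes = Namespace()
g, f = Note('G'), Note('F')
notes.insert('G', g)
try:
    notes.insert('G', f)      # Fails: the name G is already taken
except ValueError as e:
    print('insert failed: %s' % e)
notes.update('G', f)          # Succeeds: the name now refers to F
print(notes.keys['G'].name)   # F
```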

Finally, notice that changing the object to which a fully-qualified name refers (as we did above)
is a completely different operation than changing the fully-qualified name of an object. If we
wanted to refer to the note G by the name /user/docs/note/Gnatural instead of
/user/docs/note/G, we could do this:

Write Result
{ {
"name":"G", "name":"G",
"type":"/user/docs/note", "type":"/user/docs/note",
"key":[{ "key":[{
"connect":"delete", "connect":"deleted",
"namespace":"/user/docs/note", "namespace":"/user/docs/note",
"value":"G" "value":"G"
},{ },{
"connect":"insert", "connect":"inserted",
"namespace":"/user/docs/note", "namespace":"/user/docs/note",
"value":"Gnatural" "value":"Gnatural"
}] }]
} }

5.1.12. Properties, Types, and Domains


In this final section of the tutorial, we dig a little deeper into Metaweb internals. Creating types and
properties is almost always best done using the freebase.com client: there are a lot of details to
get right, and correct setup of types and properties is critical for correct functioning. In general,
it is better to be safe and let the client create types and properties for you.

On the other hand, types, properties, and domains are Metaweb objects just like any others, and
they can be created and manipulated with MQL write queries. There is some educational value
in seeing how this is done, and there are a few things that we can do with MQL that we cannot
do through the client.

5.1.12.1. Creating Self-Referential Reciprocal Properties


The first property we defined in this tutorial was /user/docs/note/next, and we used it to
model the cycle of fifths, so that the next property of the note C referred to G, and so on. We
never created a reciprocal property for next, but it seems logical that the note G should have a
previous property that refers to C.


At the time of this writing, the freebase.com client does not allow us to create reciprocal properties
where both ends of the link refer to the same type, but we can do it with MQL:

Write                                   Result
{                                       {
  "id":"/user/docs/note",                 "id":"/user/docs/note",
  "/type/type/properties":{               "/type/type/properties":{
    "create":"unless_exists",               "create":"created",
    "type":"/type/property",                "type":"/type/property",
    "name":"Previous",                      "name":"Previous",
    "key":{                                 "key":{
      "connect":"insert",                     "namespace":"/user/docs/note",
      "namespace":"/user/docs/note",          "connect":"inserted",
      "value":"previous"                      "value":"previous"
    },                                      },
    "expected_type":                        "expected_type":
      "/user/docs/note",                      "/user/docs/note",
    "unique":true,                          "unique":true,
    "master_property":                      "master_property":
      "/user/docs/note/next"                  "/user/docs/note/next"
  }                                       }
}                                       }

The first line of the query identifies the type to which we're adding the property. The second,
third, and fourth lines specify that we're creating a new /type/property object and connecting
it to the properties property of our type. The fifth line gives the property the human-readable
name "Previous" and the following four lines define a key so that the property has the
fully-qualified name /user/docs/note/previous. The expected_type property specifies that the
newly created property should link to other Note objects. The unique property specifies that each
Note can have only a single value for the previous property. And, finally, the master_property
property specifies that this new property is the reciprocal of /user/docs/note/next.

After executing this query, you can test it by querying the previous property of the note G. You
can also use the freebase.com client to browse the note objects you've created and follow their
next and previous properties back and forth.
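Because a property-creation query has so many parts to get right, it can help to assemble it with a small function. The following is a sketch only: property_query() is a hypothetical helper, not part of metaweb.py, and it covers just the fields used in the query above. The expected type is written in lowercase here, matching the id used for the note type elsewhere in this tutorial. The code is written for Python 3.

```python
def property_query(type_id, name, key, expected_type,
                   master_property=None, unique=None):
    # Describe the new /type/property object and its key within the type.
    prop = {
        'create': 'unless_exists',
        'type': '/type/property',
        'name': name,
        'key': {'connect': 'insert', 'namespace': type_id, 'value': key},
        'expected_type': expected_type,
    }
    # Optional settings are included only when the caller supplies them.
    if unique is not None:
        prop['unique'] = unique
    if master_property is not None:
        prop['master_property'] = master_property
    # Attach the property to the type through /type/type/properties.
    return {'id': type_id, '/type/type/properties': prop}

previous = property_query('/user/docs/note', 'Previous', 'previous',
                          '/user/docs/note',
                          master_property='/user/docs/note/next',
                          unique=True)
print(previous['/type/type/properties']['key'])
```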

5.1.12.2. Creating a Domain


Now that we've defined several music-related types, we may feel that they don't really belong
under either /user/docs/default_domain or /user/docs. Let's use MQL to create a new domain
and give it the fully-qualified name /user/docs/music. (At the time of this writing, there is no
way to create a custom domain with the freebase.com client.)

Domain objects have two properties: types specifies the set of types that are part of the domain.
owners specifies one or more usergroups that own the domain and have permission to create
types in it. Creating a domain is not enough: we must also set its ownership. We'll give this new
domain the same owner as /user/docs/default_domain. So before we create the domain, let's
find out the owner of the existing one:


Read                                    Result
{                                       {
  "id":"/user/docs/default_domain",       "id":"/user/docs/default_domain",
  "/type/domain/owners":[],               "/type/domain/owners":[
  "type":[]                                 "#1f80000000010499ee"
}                                         ],
                                          "type":["/type/domain"]
                                        }

In addition to querying the owner of the domain, we also queried its types. Notice that it is not
co-typed with /type/namespace even though it is used as a namespace.

Now, we can create the new domain. Remember to use your own username in place of "docs",
and to substitute the id you obtained in the query above for the one shown here.

Write Result
{ {
"create":"unconditional", "create":"created",
"type":"/type/domain", "type":"/type/domain",
"owners":"#1f80000000010499ee", "owners":"#1f80000000010499ee",
"key":{ "key":{
"connect":"insert", "connect":"inserted",
"namespace":"/user/docs", "namespace":"/user/docs",
"value":"music" "value":"music"
} }
} }

Note that this is an example of a query in which "create":"unconditional" is actually required.
The constraints in this query match an existing domain object, and we need to force the creation
of a new domain. (Since the key sub-query has a connect directive, it is not a constraint.)

The only thing we can do with a domain is add types to it. That is the subject of the next section.

5.1.12.3. Creating Types


Let's add simple note and chord types to our newly created music domain. Creating Metaweb
types is tricky. There are a number of properties that must be set correctly, and in general, a type
cannot be created in a single query: there are multiple steps that must be followed to link everything
up correctly. Our approach with MQL follows the basic type creation strategy on the client: create
the type first, and then add properties to it. We'll create the note and chord type objects at the
same time:

Write                                   Result
[{                                      [{
  "create":"unless_exists",               "create":"created",
  "type":"/type/type",                    "type":"/type/type",
  "name":"Note",                          "name":"Note",
  "key":{                                 "key":{
    "connect":"insert",                     "connect":"inserted",
    "namespace":"/user/docs/music",         "namespace":"/user/docs/music",
    "value":"note"                          "value":"note"
  },                                      },
  "domain":"/user/docs/music"             "domain":"/user/docs/music"
},{                                     },{
  "create":"unless_exists",               "create":"created",
  "type":"/type/type",                    "type":"/type/type",
  "name":"Chord",                         "name":"Chord",
  "key":{                                 "key":{
    "connect":"insert",                     "connect":"inserted",
    "namespace":"/user/docs/music",         "namespace":"/user/docs/music",
    "value":"chord"                         "value":"chord"
  },                                      },
  "domain":"/user/docs/music"             "domain":"/user/docs/music"
}]                                      }]

Note that these queries specify a fully-qualified name for the newly created type objects, but also
specify their domain. /user/docs/music serves as both the namespace and the domain for the
new types, and it is important that both are specified.

The next step is to add properties. To keep this example simple, we will add a note property to
the chord type, and then (in a separate query) add the reciprocal property to the note type. Here's
how we define the note property:

Write                                   Result
{                                       {
  "id":"/user/docs/music/chord",          "id":"/user/docs/music/chord",
  "/type/type/properties":{               "/type/type/properties":{
    "create":"unless_connected",            "create":"created",
    "type":"/type/property",                "type":"/type/property",
    "name":"Notes",                         "name":"Notes",
    "expected_type":                        "expected_type":
      "/user/docs/music/note",                "/user/docs/music/note",
    "key":{                                 "key":{
      "connect":"insert",                     "connect":"inserted",
      "namespace":                            "namespace":
        "/user/docs/music/chord",               "/user/docs/music/chord",
      "value":"note"                          "value":"note"
    }                                       }
  }                                       }
}                                       }

Now we set up the reciprocal property in the note type:


Write Result
{ {
"id":"/user/docs/music/note", "id":"/user/docs/music/note",
"/type/type/properties":{ "/type/type/properties":{
"create":"unless_connected", "create":"created",
"type":"/type/property", "type":"/type/property",
"name":"Chords", "name":"Chords",
"expected_type": "expected_type":
"/user/docs/music/chord", "/user/docs/music/chord",
"master_property": "master_property":
"/user/docs/music/chord/note", "/user/docs/music/chord/note",
"key":{ "key":{
"connect":"insert", "connect":"inserted",
"namespace": "namespace":
"/user/docs/music/note", "/user/docs/music/note",
"value":"chord" "value":"chord"
} }
} }
} }

With these types and properties defined, we can now create instances:

Write Result
{ {
"create":"unless_exists", "create":"created",
"type":"/user/docs/music/chord", "type":"/user/docs/music/chord",
"name":"CG", "name":"CG",
"note":[{ "note":[{
"create":"unless_exists", "create":"created",
"type":"/user/docs/music/note", "type":"/user/docs/music/note",
"name":"C" "name":"C"
},{ },{
"create":"unless_exists", "create":"created",
"type":"/user/docs/music/note", "type":"/user/docs/music/note",
"name":"G" "name":"G"
}] }]
} }

5.2. MQL Write Grammar


This section describes the MQL write grammar more formally. It uses the same notation as, and
borrows some definitions from, the MQL read grammar of Section 3.3.

A write-query is either a single-query or a query-list. A query-list is a comma-separated
list of single-queries within square brackets. And a single-query is a comma-separated list
of pairs within curly braces:


write-query:: single-query | query-list
query-list:: [ single-query ( , single-query )* ]
single-query:: { pair ( , pair )* }

A pair can be a property, a creation, a connection, an index-specification, or an id-query:

pair:: property |
       creation |
       connection |
       index-specification |
       id-query

A property pair is a property-name in quotation marks followed by a colon and a property-value.
A property-name is the same as it is in the MQL read grammar. See Section 3.3 for details.
A property-value is either a nested write-query, a JSON string, a JSON number, or a boolean
literal:

property:: " . property-name . " : property-value
property-value:: write-query | <JSON string> | <JSON number> | true | false

A creation directive is the keyword "create" in quotation marks followed by a colon and one of
the quoted strings "unconditional", "unless_exists", or "unless_connected":

creation:: "create" : creation-kind
creation-kind:: "unconditional" | "unless_exists" | "unless_connected"

A connection directive is similar to a creation directive. It consists of the keyword "connect"
in quotation marks followed by a colon and one of the strings "insert", "delete", or "update":

connection:: "connect" : connection-kind
connection-kind:: "insert" | "delete" | "update"

An index-specification is the keyword "index" in quotation marks, followed by a colon and
a non-negative integer:

index-specification:: "index" : <non-negative integer>

Finally, an id-query is simply the identifier "id" in quotation marks, followed by a colon and
the keyword null, without quotation marks. Alternatively, "/type/object/id" can be used in place
of "id":

id-query:: "id" : null |
           "/type/object/id" : null

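The productions in this section translate fairly directly into a structural check on a query that has already been parsed into Python lists and dictionaries. The sketch below is one possible reading of the grammar, not an official validator. The function names are hypothetical, property names are not checked (they are defined by the read grammar), and the code is written for Python 3.

```python
CREATE_KINDS = ('unconditional', 'unless_exists', 'unless_connected')
CONNECT_KINDS = ('insert', 'delete', 'update')

def is_write_query(q):
    # write-query:: single-query | query-list
    if isinstance(q, list):
        return len(q) > 0 and all(is_single_query(s) for s in q)
    return is_single_query(q)

def is_single_query(q):
    # single-query:: { pair ( , pair )* }  (at least one pair)
    if not isinstance(q, dict) or len(q) == 0:
        return False
    return all(is_pair(name, value) for name, value in q.items())

def is_pair(name, value):
    if name == 'create':                       # creation
        return value in CREATE_KINDS
    if name == 'connect':                      # connection
        return value in CONNECT_KINDS
    if name == 'index':                        # index-specification
        return (isinstance(value, int) and not isinstance(value, bool)
                and value >= 0)
    if name in ('id', '/type/object/id') and value is None:
        return True                            # id-query
    return is_property_value(value)            # property

def is_property_value(value):
    # property-value:: write-query | string | number | true | false
    if isinstance(value, (dict, list)):
        return is_write_query(value)
    # bool is a subclass of int, so True and False are accepted here too.
    return isinstance(value, (str, int, float))

print(is_write_query({'create': 'unless_exists', 'id': None,
                      'type': '/common/topic', 'name': 'my test object'}))
```

Note that null is accepted only as the value of id, which is why a query such as {"name": null} passes the read grammar but is rejected here.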

Chapter 6. Metaweb Write Services
This chapter explains, and demonstrates with examples, how to deliver MQL write queries to the
Metaweb mqlwrite service. As a necessary prerequisite, it shows how to log in to Metaweb to
obtain authentication credentials, and how to present those credentials in subsequent writes. The
chapter also demonstrates how to upload content, such as images and HTML documents, to the
Metaweb content store.

The examples in this chapter are all written in Python. They all communicate with the Metaweb
services at sandbox.freebase.com. And they are all designed to be run as command-line utilities.
Note that this differs from Chapter 4 which placed more emphasis on server-side code for web
applications. Many Metaweb-enabled web applications will be read-only, but the Python code
shown in this chapter should be straightforward to port for use in server-side scripts if you need
it.

6.1. Logging in to Metaweb


No registration or authentication is required to use the mqlread service, but writing to Metaweb
requires that you log in first. Before we can cover the mqlwrite service, therefore, we must
explain the login service.

Since Metaweb services are HTTP-based, Metaweb authentication is cookie-based. You log in
by making an HTTP POST request to the URL http://www.freebase.com/api/account/login,
passing your username and password as URL-encoded form parameters. If login is successful,
Metaweb returns one or more HTTP cookies in the response headers. These cookies contain your
authentication credentials, and you must pass these back to Metaweb in the HTTP request headers
of all subsequent write and upload requests.

The names and values of the authentication cookies are an implementation detail rather than a
specification detail, and are subject to change. To ensure success, your code must accept all
cookies returned by the login service, and must present all of them to the mqlwrite service. If you
write your applications using a suitably high-level HTTP library, cookie handling may be
performed automatically for you. For this chapter, however, we explicitly handle the cookies at a
lower level.

The cookies returned by the login service are persistent, which means that you do not have to
log into the freebase.com client each time you visit the site. Nevertheless, when writing scripts
that use Metaweb services, the best practice is to assume that your cookies have expired and log
in each time the script is invoked.

Example 6.1 shows Python code that defines a metaweb.login() utility function. This function
sends a username and password to the login service, and returns cookies that can be treated as
opaque authentication credentials. If login fails, the function raises a metaweb.MQLError exception.
This login utility function is part of a larger metaweb.py module. Other examples in this chapter
are also part of the module. We first saw metaweb.py in Chapter 4, where Example 4.3 defined
a metaweb.read() utility function. Example 4.3 also includes the definition of the MQLError
exception class used in this example. Example 6.1 (and all our other Python examples) also depends
on the simplejson module, which is available from http://cheeseshop.python.org/pypi/simplejson.

Example 6.1. metaweb.py: logging in to Metaweb with Python

import httplib     # Low-level HTTP networking
import urllib      # URL encoding
import simplejson  # JSON serialization and parsing

host = 'sandbox.freebase.com'        # The Metaweb host
loginservice = '/api/account/login'  # Path to login service

# Submit the specified username and password to the Metaweb login service.
# Return opaque authentication credentials on success.
# Raise MQLError on failure.
def login(username, password):
    # Establish a connection to the server and make a request.
    # Note that we use the low-level httplib library instead of urllib2.
    # This allows us to manage cookies explicitly.
    conn = httplib.HTTPConnection(host)
    conn.request('POST',        # POST the request
                 loginservice,  # The URL path /api/account/login
                 # The body of the request: encoded username/password
                 urllib.urlencode({'username':username, 'password':password}),
                 # This header specifies how the body of the post is encoded.
                 {'Content-type': 'application/x-www-form-urlencoded'})

    # Get the response from the server
    response = conn.getresponse()

    if response.status == 200:  # We get HTTP 200 OK even if login fails
        # Parse response body and raise a MQLError if login failed
        body = simplejson.loads(response.read())
        if body['status'] != '200 OK':
            raise MQLError(body['messages'][0]['text'])

        # Otherwise return cookies to serve as authentication credentials.
        # The set-cookie header holds one or more cookie specifications,
        # separated by commas. Each specification is a name, an equal
        # sign, a value, and one or more trailing clauses that consist
        # of a semicolon and some metadata. We don't care about the
        # metadata. We just want to return a comma-separated list of
        # name=value pairs.
        cookies = response.getheader('set-cookie').split(',')
        return ';'.join([c[0:c.index(';')] for c in cookies])
    else:  # This should never happen
        raise MQLError('HTTP Error: %d %s' % (response.status,response.reason))
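Example 6.1 splits the combined set-cookie header on commas, which assumes that no cookie attribute itself contains a comma. A date-valued attribute such as expires may contain one. A more defensive variation, sketched below for Python 3, takes the Set-Cookie headers one at a time (with http.client they are available as a list from response.headers.get_all('Set-Cookie')) and keeps only the leading name=value pair of each. The cookie_header() helper and the cookie names in the sample data are made up for illustration.

```python
def cookie_header(set_cookie_values):
    # set_cookie_values holds one string per Set-Cookie response header.
    pairs = []
    for value in set_cookie_values:
        # Everything after the first semicolon is metadata we don't need.
        pair = value.split(';', 1)[0].strip()
        if pair:
            pairs.append(pair)
    # Join the pairs in the form expected by the Cookie request header.
    return '; '.join(pairs)

# Made-up sample headers; note the comma inside the expires attribute.
sample = [
    'session=abc123; expires=Wed, 09-Jun-2021 10:18:14 GMT; path=/',
    'user=docs; path=/',
]
print(cookie_header(sample))  # session=abc123; user=docs
```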


6.1.1. The Login API
As you can see from Example 6.1, the login service expects URL-encoded username and password
parameters, in the body of an HTTP POST request to the path /api/account/login. The login service
always returns an HTTP status code of "200 OK", even when login fails. The body of the response
is a JSON formatted object, and the status property of this object specifies whether the login
succeeded or failed. If the status property is "200 OK", then login succeeded, and the response
will include a "set-cookie" header that contains authentication credentials.

If the login fails, then the status property of the JSON object will be something other than "200
OK". In this case, the messages property is an array (typically with just one element) of message
objects. The text property of the first element of this array includes an error message that describes
what went wrong. (Typically, this is "Invalid username or password".)

6.2. Making Write Queries


Chapter 5 demonstrated, in great detail, how to express write queries in MQL. Now we explain
how to send those queries to Metaweb and retrieve the result. The Metaweb write service is
mqlwrite. The path to this service is /api/service/mqlwrite, and it responds to HTTP POST
requests only. Wrap your query in an envelope object (described below) and submit the envelope
as the value of the q parameter in the body of the request. Also, include your authentication
credentials in the HTTP Cookie header of the request. The mqlwrite service performs your query,
wraps the result in a response envelope object, and returns the JSON serialization of that object
as the body of its HTTP response.

X-Metaweb-Request

The Metaweb mqlwrite service (and also the upload service documented later in this chapter)
requires a custom HTTP request header, named X-Metaweb-Request, to be present in all
requests. The value of the header can be anything.

The requirement that the custom header be present is a security measure to prevent XSS
(cross-site scripting) attacks. It has a profound implication for the services that require it:
these services cannot be invoked via HTML form submission, since there is no way to tell
a web browser to add a custom header like this when POSTing a form.
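To make the header requirements concrete, the sketch below assembles an mqlwrite request object without sending it. It is written for Python 3, so it uses urllib.request rather than the Python 2 modules of this chapter's examples. The mqlwrite_request() function is a hypothetical helper, and the body and credentials passed to it are placeholders.

```python
import urllib.request

def mqlwrite_request(body, credentials, host='sandbox.freebase.com'):
    # body: the URL-encoded form data; credentials: the cookie string
    url = 'http://%s/api/service/mqlwrite' % host
    headers = {
        'Cookie': credentials,            # Authentication credentials
        'Content-Type': 'application/x-www-form-urlencoded',
        'X-Metaweb-Request': 'True',      # Required; any value will do
    }
    # Supplying data makes this a POST request. Nothing is sent until
    # the request is passed to urllib.request.urlopen().
    return urllib.request.Request(url, data=body.encode('ascii'),
                                  headers=headers)

req = mqlwrite_request('q=%7B%7D', 'name=value')  # Placeholder arguments
print(req.get_method())  # POST
```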

The query and response envelopes are described in the sub-sections that follow, and those
explanations are followed by example code that performs writes.

6.2.1. The mqlwrite Query Envelope


The value of the q parameter should be the JSON serialization of an envelope object. This object
simply serves to give your query a name, by which it will be referred to in the response object.
Consider the following query:

{
  "create":"unless_exists",
  "type":"/common/topic",
  "name":"my test object"
}

To submit it to mqlwrite you would first envelop it like this:

{
  "query":{
    "create":"unless_exists",
    "type":"/common/topic",
    "name":"my test object"
  }
}

The name "query" is arbitrary, but it will be reused in the response envelope.
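In code, enveloping a query and encoding it as the q parameter takes only a couple of lines. The following is a minimal sketch for Python 3 using the standard json and urllib.parse modules. The mqlwrite_body() helper and its name parameter are hypothetical.

```python
import json
import urllib.parse

def mqlwrite_body(query, name='query'):
    # Wrap the query in an envelope under the given name, serialize the
    # envelope as JSON, and URL-encode it as the q form parameter.
    envelope = {name: query}
    return urllib.parse.urlencode({'q': json.dumps(envelope)})

body = mqlwrite_body({'create': 'unless_exists',
                      'type': '/common/topic',
                      'name': 'my test object'})
print(body)
```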

6.2.2. The Response Envelope


mqlwrite is an HTTP service. It returns an HTTP response with a Content-Type header of
application/json. The body of the response is a JSON-serialized object. Only part of this object
is the result of the submitted query or queries: other properties provide meta-information. The
following are the properties of the response envelope:

result    The value of this property is an object that holds the results of the submitted
          queries. If the query envelope used the property name query for the query, then
          the response envelope will make the results of the query available as result.query.
          If a query fails, this property will be set to null.

status    For successful queries, this property will have the value "200 OK".

queries   The value of this property is a copy of the query envelope that was submitted.

messages  This property is an array of error messages. For successful queries it is empty. For
          unsuccessful queries, it contains one or more message objects. Each message object
          has the following properties:

          text   A human-readable description of the error.

          args   An object that provides additional details about the error. For example,
                 if a query uses "create":"unless_exists" and it cannot complete because
                 two matching objects exist, then the text property might be "Need a unique
                 result to attach here, not 2", and the args object will specify the guids
                 of the two matching objects.

          query  A copy of the query object (not the query envelope, but the query itself)
                 that contains the error, with the addition of a special error_inside
                 property to indicate where the error occurs.

          level  A string that indicates the severity of the error. A typical value is "error".

          type   A string that indicates the kind of error that occurred. A typical value is
                 "/service/error/type" to indicate a type error.

6.2.3. A mqlwrite Utility Function


Example 6.2 is another piece of our metaweb.py module. It defines a metaweb.write() function
that submits a MQL write query to the mqlwrite service, and returns the result. The query that
is submitted and the result that is returned are both Python dictionary objects. The utility function
takes care of JSON serialization and parsing and also seals the query in a query envelope and
opens the response envelope to extract the query result. Note that metaweb.write() expects
authentication credentials as its second argument. Use it with the metaweb.login() function in
code like this:

import metaweb                                      # Use the metaweb module

query = {'create':'unless_exists', 'id':None,       # This is our query
         'type':'/common/topic', 'name':'my test object'}
credentials = metaweb.login("name", "pass")         # Login first
result = metaweb.write(query, credentials)          # Execute query
print "%s: %s" % (result['create'], result['id'])   # Print result details

The code in Example 6.2 uses the metaweb.MQLError exception class. That class was defined in
Example 4.3.

Example 6.2. metaweb.py: sending a query to mqlwrite

import urllib             # URL encoding
import urllib2            # Higher-level URL content fetching
import simplejson         # JSON serialization and parsing

host = 'sandbox.freebase.com'              # The Metaweb host
writeservice = '/api/service/mqlwrite'     # Path to mqlwrite service

def write(query, credentials):
    # We're requesting this URL
    req = urllib2.Request('http://%s%s' % (host, writeservice))
    # Send our authentication credentials as a cookie
    req.add_header('Cookie', credentials)
    # The body of the POST request is encoded URL parameters
    req.add_header('Content-type', 'application/x-www-form-urlencoded')
    # This custom header is required and guards against XSS attacks
    req.add_header('X-Metaweb-Request', 'True')
    # Wrap the query object in a query envelope
    envelope = {'query': query}
    # JSON encode the envelope
    encoded = simplejson.dumps(envelope)
    # Use the encoded envelope as the value of the q parameter in the body
    # of the request. Specifying a body automatically makes this a POST.
    req.add_data(urllib.urlencode({'q':encoded}))
    try:
        f = urllib2.urlopen(req)                 # Submit the request
        response = simplejson.load(f)            # Parse HTTP response as JSON
        if (response['status'] == '200 OK'):     # Check for valid result
            return response['result']['query']   # Open envelope, return result
        else:                                    # Otherwise raise an exception
            raise MQLError(response['messages'][0]['text'])
    except urllib2.HTTPError, e:                 # If anything goes wrong
        if e.code == 400:                        # For code 400:
            body = simplejson.load(e)            # parse the response
            raise MQLError(body['messages'][0]['text'])  # to get an error message
        else:  # For any other error code, report the code and the response body
            raise MQLError('HTTP Error %d:\n%s' % (e.code, e.read()))

6.2.4. Example: US State Quarters


Example 6.1 and Example 6.2 are helpful utility functions, but they are just utilities, not real-
world examples of how you might write to Metaweb. For the sake of example, let's suppose that
you are a coin collector and you think that freebase.com should include information about each
of the coins issued by the US Mint under its 50 State Quarters program. First, visit the US Mint's
website 1 to find out when each state quarter was released, how many were minted for each state,
and when each state became a state.

Next, create a Metaweb type to model this data. Log in to the freebase.com client and create a
new type named "US State Quarter". (Review the procedure for creating types and adding
properties in Chapter 5, if necessary.) If your Freebase username is "fred", this will create a type with
id /user/fred/default_domain/us_state_quarter.

Next, give your type four properties:

• A property named "State", of /type/text, to specify the name of the state. (The freebase.com
client specifies types by name rather than by id, so use the type name "Text" in the client.)

• A property named "Release", of /type/datetime, to specify the date on which the quarter was
released into circulation. (Use the type name "Date/Time".)

• A property named "Mintage", of /type/int, to specify how many quarters were minted. (Use
the type name "Integer".)

• A property named "Statehood", of /type/datetime, to specify when the state gained statehood.

Be sure to make each of these properties unique by clicking the "Restrict to one value" checkbox.

Next, you need to get your data into manageable form. Extract data from the US Mint site, and
arrange it in a plain text file named quarters.txt that looks like the following:

Delaware,1999-01-04,1787-12-07,774824000
Pennsylvania,1999-03-08,1787-12-12,707332000
New Jersey,1999-05-17,1787-12-18,662228000
Georgia,1999-07-19,1788-01-02,939932000
Connecticut,1999-10-12,1788-01-09,1346624000
Massachusetts,2000-01-03,1788-02-06,1163784000

1 Specifically the page http://www.usmint.gov/mint_programs/50sq_program/index.cfm?action=schedule

Each line in this file is the data for a single quarter. Fields are separated by commas. The first
field is the name of the state. The second and third fields are the release date and statehood date
for that state. And the fourth field is the mintage for that state quarter.
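
The field layout just described can be captured in a small parsing helper. This is only a sketch: the function name parse_quarter is hypothetical, and it assumes that every line has exactly four comma-separated fields and that no state name contains a comma.

```python
def parse_quarter(line):
    """Split one line of quarters.txt into named fields."""
    # Field order: state name, release date, statehood date, mintage
    state, release, statehood, mintage = line.strip().split(",")
    return {"state": state,
            "release": release,
            "statehood": statehood,
            "mintage": int(mintage)}   # Mintage is stored as an integer


record = parse_quarter("Delaware,1999-01-04,1787-12-07,774824000\n")
```

Unpacking into four names makes a malformed line fail immediately with a ValueError instead of silently producing a record with missing fields.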

With our type created, and the data in this format, we can now write a simple script to upload
the data to freebase.com. Example 6.3 shows Python code to do this. Note that you need to insert
your own Freebase username and password into the script to make it work for you.

Example 6.3. quarters.py: writing a data set to Metaweb

import metaweb                 # Use the metaweb module

USERNAME = 'username'          # Put your Freebase username and password here
PASSWORD = 'password'

# The ID for our US State Quarter type depends on our username
TYPEID = '/user/' + USERNAME + '/default_domain/us_state_quarter'

# Make sure we can log in before we go any further
credentials = metaweb.login(USERNAME, PASSWORD)

# We will be creating multiple quarters in a single MQL query.
# We start with an empty array and add MQL writes to it in the loop below
query = []

# Open our file of quarter data
f = open("quarters.txt", "r")

# Loop through lines of the file
for line in f:
    # Break each line into fields
    fields = line.strip().split(',')

    # This query creates a single quarter
    q = {'create':'unless_exists',              # Create a new object
         'id':None,                             # And return its id
         'type':["/common/topic", TYPEID],      # Make it a topic and a quarter
         'name': fields[0] + ' State Quarter',  # The object's name
         'state': fields[0],                    # State name
         'release': fields[1],                  # Release date
         'statehood': fields[2],                # Statehood date
         'mintage': int(fields[3])}             # How many minted

    # Add this write to the array of writes
    query.append(q)

# Close the data file
f.close()

# Now send our one big query to Metaweb and get the result
result = metaweb.write(query, credentials)

# Display the id of the Metaweb object for each state
for r in result:
    print "%s %s: %s" % (r['create'], r['state'], r['id'])

6.2.5. Sending Multiple Queries to mqlwrite


As we saw in Chapter 5, the MQL write grammar allows multiple independent writes to be
specified in a single query as elements of a JSON array. The mqlwrite service also allows multiple
named queries to be placed in an envelope. Suppose we want to create two objects in a single
call to mqlwrite. Here are two envelopes that can accomplish that:

2 Writes in 1 Query:

{
  "query":[{
    "create":"unless_exists",
    "type":"/common/topic",
    "name":"my test object #1"
  },{
    "create":"unless_exists",
    "type":"/common/topic",
    "name":"my test object #2"
  }]
}

2 Queries in 1 Envelope:

{
  "query1":{
    "create":"unless_exists",
    "type":"/common/topic",
    "name":"my test object #3"
  },
  "query2":{
    "create":"unless_exists",
    "type":"/common/topic",
    "name":"my test object #4"
  }
}

When you include multiple writes in a single query, the writes are executed atomically: they all
succeed or they all fail. As a result, they are not allowed to depend on each other, and there is no
way to tell what order they are executed in.

If you submit multiple queries in a single envelope, they are not atomic. Each one succeeds or
fails on its own. JSON envelopes are unordered collections of properties, and there is no guarantee
that the queries will be executed in the order in which they are written. For this reason, queries
that share an envelope should not depend on each other.

The fact that mqlwrite can accept multiple queries explains why an envelope object is required
in the first place. It is needed so that each query has a name that can be used for query results. If
multiple named queries are submitted in an envelope, then the response envelope contains multiple
results. If the queries are named q1 and q2, then the responses are available as result.q1 and
result.q2.
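
A sketch of how named queries and their results line up is shown below. The helper names make_envelope and results_by_name are hypothetical, the sample response is invented for illustration, and the standard json module stands in for simplejson. How the service reports the failure of one query among several is not covered here, so the sketch simply returns None for any name that has no result.

```python
import json


def make_envelope(named_queries):
    """Serialize a dict of {name: query} as a single envelope string."""
    return json.dumps(named_queries)


def results_by_name(response, names):
    """Map each query name to its entry in the response's result object."""
    results = response.get("result") or {}
    return dict((name, results.get(name)) for name in names)


queries = {
    "q1": {"create": "unless_exists", "type": "/common/topic",
           "name": "my test object #3"},
    "q2": {"create": "unless_exists", "type": "/common/topic",
           "name": "my test object #4"},
}
envelope = make_envelope(queries)

# A made-up response envelope, shaped as described in the text above
response = {"status": "200 OK",
            "result": {"q1": {"create": "created"},
                       "q2": {"create": "existed"}}}
by_name = results_by_name(response, ["q1", "q2"])
```

Because the queries are looked up by name rather than by position, the code does not rely on the order in which the service executes them.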

6.3. Uploading Data to Metaweb


The unique feature of Metaweb is the way that it stores relationships between objects. But, like
any database, it can also store large chunks of data, such as long HTML documents or binary
image files. In Chapter 4 we learned about the trans service for retrieving content. Here, we'll
learn how to upload content to be stored in Metaweb. The service for uploading content is named
upload, and has the URL path /api/service/upload. The upload service responds only to HTTP
POST requests. The data to be uploaded is passed in the body of the request. The MIME type of
the content, as well as the encoding of textual content is specified in the HTTP Content-Type
request header.
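
As a rough illustration of the request headers involved, the sketch below builds a Content-Type value and the header set used for an upload. The function names are hypothetical. The "; charset=" parameter is the standard HTTP way to state a text encoding; that the upload service reads the encoding from exactly this parameter is an assumption. The Cookie and X-Metaweb-Request headers are the ones used by the utilities in this chapter.

```python
def content_type_header(mime_type, charset=None):
    """Build a Content-Type value, adding a charset for textual content."""
    if charset:
        return "%s; charset=%s" % (mime_type, charset)
    return mime_type


def upload_headers(mime_type, credentials, charset=None):
    """Return the headers to send with a POST to the upload service."""
    return {"Content-Type": content_type_header(mime_type, charset),
            "Cookie": credentials,            # Authentication cookie from login
            "X-Metaweb-Request": "True"}      # Custom header required by the service


html_headers = upload_headers("text/html", "cookie-data", charset="utf-8")
gif_headers = upload_headers("image/gif", "cookie-data")
```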

6.3.1. An Upload Utility


Example 6.4 shows the Python implementation of a metaweb.upload() utility. Pass it a string of
content (Python strings can be binary data or textual data), a MIME type for the content, and the
authentication credentials returned when you logged in. upload() creates a new /type/content
object to hold your content and returns the guid of that object to you. This guid can be used to
retrieve the content with the /api/trans/raw service (see Chapter 4).

No Duplicates

The upload service does not always create a new /type/content object for the content you
upload. All content is checksummed when it is uploaded, and these checksums are used to
detect duplicate uploads. If the content you are uploading already exists in the Metaweb
content store, the existing /type/content object is used. The blob_id property of
/type/content holds the checksum value.
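
The idea of checksum-based duplicate detection can be sketched with a toy in-memory store. This is not Metaweb's implementation: the class ContentStore and its ids are hypothetical, and SHA-256 is an arbitrary stand-in because the checksum algorithm that Metaweb uses is not stated here.

```python
import hashlib


class ContentStore(object):
    """Toy content store that reuses an id when identical bytes are uploaded."""

    def __init__(self):
        self._ids_by_checksum = {}   # Maps checksum -> id of stored content
        self._next_number = 1

    def upload(self, content):
        # The checksum plays the role that blob_id plays for /type/content
        checksum = hashlib.sha256(content).hexdigest()
        if checksum not in self._ids_by_checksum:
            # New content: allocate a new (made-up) id for it
            self._ids_by_checksum[checksum] = "content-%d" % self._next_number
            self._next_number += 1
        # Either way, return the id associated with this checksum
        return self._ids_by_checksum[checksum]


store = ContentStore()
first = store.upload(b"<p>hello</p>")
second = store.upload(b"<p>hello</p>")     # Same bytes: existing id is reused
third = store.upload(b"<p>goodbye</p>")    # Different bytes: new id
```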

Example 6.4. metaweb.py: uploading content to Metaweb

import urllib2            # Higher-level URL content fetching
import simplejson         # JSON serialization and parsing

host = 'sandbox.freebase.com'             # The Metaweb host
uploadservice = '/api/service/upload'     # Path to upload service

# Upload the specified content (and give it the specified type).
# Return the guid of the /type/content object that represents it.
# The returned guid can be used to retrieve the content with /api/trans/raw.
def upload(content, type, credentials):
    # This is the URL we POST content to
    url = 'http://%s%s'%(host,uploadservice)
    # Build the HTTP request
    req = urllib2.Request(url, content)          # URL and content to POST
    req.add_header('Content-Type', type)         # Content type header
    req.add_header('Cookie', credentials)        # Authentication header
    req.add_header('X-Metaweb-Request', 'True')  # Guard against XSS attacks
    try:
        f = urllib2.urlopen(req)                 # POST the request
        response = simplejson.load(f)            # Parse the response
        return response['result']['id']          # Extract and return content id
    except urllib2.HTTPError, e:
        if e.code == 400:                        # For code 400:
            body = simplejson.load(e)            # parse the response
            raise MQLError(body['messages'][0]['text'])  # to get error message
        else:  # For any other error, report code and response body
            raise MQLError('HTTP Error %d:\n%s' % (e.code, e.read()))

6.3.2. Examples: Uploading Images of State Quarters


In order to demonstrate the metaweb.upload() utility of Example 6.4, let's continue with our US
State Quarter example. The US Mint has images of each of the state quarters on its website. Let's
upload those images to freebase.com, and make them visible through the /common/topic/image
property of each quarter object. Example 6.5 shows how to do this.

In order to understand this code, you have to know that the upload service handles images
specially. When you upload images, the /type/content object is also given the type /common/image.
Furthermore, the upload service determines the size of the image and creates an appropriate
/measurement_unit/rectangle_size object for the /common/image/size property. Because
your uploaded /type/content is co-typed as /common/image, you can link directly to image
content from /common/topic/image.

Example 6.5. quarterpix.py: uploading images to Metaweb

import metaweb            # Our metaweb utilities
import urllib2            # For downloading images from the mint server

USERNAME = 'username'     # Put your Freebase username and password here
PASSWORD = 'password'

# The ID for our US State Quarter type depends on our username
TYPEID = '/user/' + USERNAME + '/default_domain/us_state_quarter'

# Make sure we can log in before we go any further
credentials = metaweb.login(USERNAME, PASSWORD)

# All the images files are beneath this URL
imagedir = 'http://www.usmint.gov/images/mint_programs/50sq_program/states/'

# This dictionary maps state name to image name.
images = { 'Delaware': 'DE_winner.gif',
           'Pennsylvania': 'PA_winner.gif',
           'New Jersey': 'NJ_winner.gif',
           'Georgia': 'GA_winner.gif',
           'Connecticut': 'CT_winner.gif',
           'Massachusetts': 'MA_winner.gif'}

# Loop through the states
for state,filename in images.items():
    # First, download the image from the Mint's website
    image = urllib2.urlopen(imagedir + filename)
    type = image.info()['Content-Type']
    content = image.read()

    # Now upload it to Metaweb
    id = metaweb.upload(content, type, credentials)

    # Define a write query to link the quarter object to the uploaded image
    query = { 'type': TYPEID,
              'state':state,
              '/common/topic/image': { 'id':id, 'connect':'insert' }}

    # Submit the query and get the result
    result = metaweb.write(query, credentials)

    # Output the result
    print "%s: %s %s" %(state, result['/common/topic/image']['connect'], id)

Once you have run the code in Example 6.5, use the freebase.com client to view your state quarter
topics. You'll see that there are now images on the page.

6.3.3. Uploading Documents


It is fairly easy to upload an image and make it visible to users of freebase.com. It is a little
trickier to do the same for textual content. When you upload a document, a /type/content object
is created for that document. This allows the content to be retrieved with /api/trans/raw, but it
doesn't allow it to be viewed in any natural way in the client. To accomplish that, you must create
a /common/document object to reference the content, and a /common/topic object to reference
the document. Example 6.6 shows how you can do this.

Example 6.6. uploaddoc.py: uploading HTML documents to Metaweb

import sys, re, metaweb

# Read the content of the file specified on the command line
# It must be an HTML file with a <title>
filename = sys.argv[1]
try:
    f = open(filename)
    doc = f.read()
    f.close()
except Exception, e:
    sys.exit(e)

# Search through the document for a title
try: title = re.search("(?i)<title>(.*)</title>", doc).group(1)
except: sys.exit("Document has no title")

# Log in to Metaweb. Put your own username and password here
credentials = metaweb.login('username','password')

# Upload the document content to Metaweb.
# Note that we hardcode the text/html content type.
content_id = metaweb.upload(doc, "text/html", credentials)

# Submit a MQL write query to create a /common/topic and /common/document
# for the uploaded content
result = metaweb.write({'create':'unless_exists',
                        'type':'/common/topic',
                        'id':None,
                        'name':title,
                        'article' : { 'create':'unless_exists',
                                      'type':'/common/document',
                                      'id':None,
                                      'content':content_id }},
                       credentials)

# Tell the user what we did
print "Uploaded %s: %s\n\tcontent: %s\n\tdocument: %s %s\n\ttopic: %s %s" % (
    filename, title, content_id,
    result['article']['create'], result['article']['id'],
    result['create'], result['id'])

This Python program expects the name of an HTML file as a command-line argument. It reads
the file and determines the document title by searching for a <title> tag. It uploads the document
text with the metaweb.upload() method of Example 6.4. Then it submits a MQL write query to
create a /common/topic that refers to a /common/document that refers to the uploaded content. It
uses the document title as the name of the /common/topic object.
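
The title search in Example 6.6 uses the pattern (.*), which does not match across line breaks, so a title that is wrapped onto several lines would be reported as missing. The following sketch shows one possible variation; the function name find_title is hypothetical, and a regular expression remains only a rough substitute for a real HTML parser.

```python
import re

# Non-greedy match between the tags; DOTALL lets "." span line breaks
TITLE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def find_title(html):
    """Return the text of the first <title> element, or None if absent."""
    match = TITLE.search(html)
    if match is None:
        return None
    # Collapse line breaks and runs of spaces into single spaces
    return " ".join(match.group(1).split())


title = find_title("<html><head><TITLE>\n My  Page\n</TITLE></head></html>")
```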

6.4. Example: Unlinking Objects


As discussed in Chapter 5, Metaweb does not allow objects to be deleted. The closest we can
come to deleting an object is deleting all the links between that object and others. When important
links such as the object's name and type are deleted, the object effectively ceases to have any
useful identity. Example 6.7 is an object unlinking utility, which attempts to unlink all objects
you have created (or a subset with the specified type and/or name) in the database. For each object
to be deleted, it first queries the set of types of the object. Then it queries the value of all properties
defined by all of those types. It turns the results of this read query into a write query, by adding
a "connect":"delete" directive to each property. Finally, it executes this write query to unlink
all properties of the object.
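
The last step, turning a read result into a write query, can be tried on its own without contacting Metaweb. The sketch below mirrors that transformation on a made-up read result; the function name to_unlink_query, the placeholder id, and the sample values are illustrative assumptions.

```python
def to_unlink_query(read_result):
    """Turn the result of a read query into a write query that unlinks."""
    write = {}
    for prop, value in read_result.items():
        if prop == "id":
            write[prop] = value          # The id identifies the object; keep it
        elif value:                      # Skip properties whose array is empty
            # Copy each element and add the connect:delete directive
            write[prop] = [dict(elt, connect="delete") for elt in value]
    return write


# A made-up read result for an object with one name, one type, and no keys
read_result = {
    "id": "/example/object",
    "name": [{"value": "my test object", "lang": "/lang/en"}],
    "type": [{"id": "/common/topic"}],
    "key": [],
}
unlink = to_unlink_query(read_result)
```

Unlike the loop in Example 6.7, this version copies each element instead of modifying the read result in place, which makes it easier to compare the two queries when debugging.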

Example 6.7 performs a number of read queries as well as writes. To do this, it uses the
metaweb.read() utility defined in Example 4.3 of Chapter 4.

Example 6.7 is a relatively complex example. It demonstrates the mqlwrite service, of course,
but also demonstrates low-level reads and writes of types and properties. If you can make sense
of this example, you have a solid understanding of the Metaweb architecture.

Example 6.7. unlink.py: unlinking Metaweb objects

import sys, getopt        # Command-line parsing
import metaweb            # login, read, and write utilities

# A cache that maps type ids to arrays of property information
typecache = {}

# Primitive types other than /type/text and /type/key, which are
# handled specially
primitives = set(["/type/boolean", "/type/datetime", "/type/float",
                  "/type/id", "/type/int", "/type/rawstring", "/type/uri"])

# Return information about the properties of type t.
# Use a cache to avoid repeated queries
def getPropertiesOfType(t):
    if (t not in typecache):
        props = metaweb.read({'type':'/type/type',
                              'id':t,
                              'properties':[{'id':None,
                                             'expected_type':None}]})
        typecache[t] = props['properties']

    return typecache[t]

#
# Return a MQL write query that will unlink the object with the specified id
#
def makeUnlinkQuery(id):
    # Find all types of the object
    types = metaweb.read({ 'id':id, 'type':[] })['type']

    # Make a list of all properties of all types of the object
    props = []
    for t in types:
        props.extend(getPropertiesOfType(t))

    # We start off with queries for the well-known properties of /type/object
    q = { 'id': id,
          'name': [{'value':None, 'lang':None, 'optional':True}],
          'type': [{'id':None, 'optional':True}],
          'key': [{'value':None, 'namespace':None, 'optional':True}]}

    # Add properties to this query to ask for the id or value of all
    # other properties of all types. Note that the way we ask for the value
    # of a property depends on the expected type of the property
    for p in props:
        propid = p['id']
        proptype = p['expected_type']
        # We now need to add propid to the query.
        # The value of the property depends on its expected type.
        if proptype == '/type/text':
            q[propid] = [{ 'value':None, 'lang':None, 'optional':True }]
        elif proptype == '/type/key':
            q[propid] = [{ 'value':None, 'namespace':None, 'optional':True }]
        elif proptype in primitives:
            q[propid] = [{ 'value':None, 'optional':True }]
        else:
            q[propid] = [{ 'id': None, 'optional':True }]

    # Get the result of this query
    r = metaweb.read(q)

    # The query is structured in such a way that the result is almost ready
    # to be reused as a write query. We loop through the properties of the
    # result, and for any that are non-empty, we copy them to a query object
    # adding "connect":"delete" to transform it into a MQL write query
    q = {}
    for p,v in r.iteritems():
        if p == 'id':   # leave the id of the query alone
            q[p] = v
            continue
        # if the property is not id, then the value is an array
        # if the array is empty, then skip this property; don't copy to query
        if len(v) == 0: continue
        # otherwise, iterate through the elements of the array
        # and add the connect:delete directive to each one
        for elt in v: elt['connect'] = 'delete'
        # and copy to the query
        q[p] = v

    # Return the MQL write query that we can use to unlink the object
    return q

def main():
    # Use the getopt module to parse the command line arguments
    try:
        opts, args = getopt.getopt(sys.argv[1:],
                                   "u:p:n:t:",
                                   ["user=","password=","name=","type=","debug"])
    except getopt.error, msg:
        print msg
        sys.exit(0)

    # Now process each of the parsed arguments
    username, password, name, type, debug = (None, None, None, None, None)
    for o,a in opts:
        if (o in ('-u', '--user')): username = a
        elif (o in ('-p', '--password')): password = a
        elif (o in ('-n', '--name')): name = a
        elif (o in ('-t', '--type')): type = a
        elif (o == '--debug'): debug = True

    if username is None or password is None:
        sys.exit("username and password must be specified")

    # Use the username and password to login to metaweb
    credentials = metaweb.login(username, password)

    # Start building a query to find objects to be unlinked
    # We will only attempt to unlink objects that we created ourselves.
    # By default we will unlink all objects we've created
    query = [{'creator': '/user/' + username,
              'id': None }]

    # If the name argument was specified, then we only want to unlink
    # objects with the specified name
    if (name is not None):
        query[0]['name'] = name

    # If the type argument was specified then we only want to unlink
    # objects with the specified type
    if (type is not None):
        query[0]['type'] = type

    # Run the query to get the objects we're going to delete
    objects = metaweb.read(query)

    # If we didn't find any, quit now
    if len(objects) == 0:
        sys.exit("Nothing to unlink")

    # Now unlink each of the objects we found
    for o in objects:
        id = o['id']
        print "Unlinking: %s" % id
        # build the query
        q = makeUnlinkQuery(id)
        if (debug): print "Unlink query: " + repr(q)
        # run the query
        result = metaweb.write(q, credentials)
        if (debug): print "Unlink result: " + repr(result)

# If we're executing this file from the command-line, run the main() method
if __name__ == "__main__": main()

Appendix A. Additional Code
This appendix collects interesting or useful examples that are too long, or would be too much of
a digression, to appear elsewhere in this manual.

A.1. json.js
Example A.1 is the json.js module that defines the JSON.parse() and JSON.serialize()
functions used in the JavaScript-based examples of Chapter 4.

Example A.1. json.js: JSON parsing and serialization in JavaScript

/**
 * json.js:
 * This file defines functions JSON.parse() and JSON.serialize()
 * for decoding and encoding JavaScript objects and arrays from and to
 * application/json format.
 *
 * The JSON.parse() function is a safe parser: it uses eval() for
 * efficiency but first ensures that its argument contains only legal
 * JSON literals rather than unrestricted JavaScript code.
 *
 * This code is derived from the code at http://www.json.org/json.js
 * which was written and placed in the public domain by Douglas Crockford.
 **/
// This object holds our parse and serialize functions
var JSON = {};

// The parse function is short but the validation code is complex.
// See http://www.ietf.org/rfc/rfc4627.txt
JSON.parse = function(s) {
    try {
        return !(/[^,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]/.test(
                     s.replace(/"(\\.|[^"\\])*"/g, ''))) &&
            eval('(' + s + ')');
    }
    catch (e) {
        return false;
    }
};

// Our JSON.serialize() function requires a number of helper functions.
// They are all defined within this anonymous function so that they remain
// private and do not pollute the global namespace.
(function () {
    var m = {  // A character conversion map
            '\b': '\\b', '\t': '\\t', '\n': '\\n', '\f': '\\f',
            '\r': '\\r', '"' : '\\"', '\\': '\\\\'
        },
        s = {  // Map type names to functions for serializing those types
            'boolean': function (x) { return String(x); },
            'null': function (x) { return "null"; },
            number: function (x) { return isFinite(x) ? String(x) : 'null'; },
            string: function (x) {
                if (/["\\\x00-\x1f]/.test(x)) {
                    x = x.replace(/([\x00-\x1f\\"])/g, function(a, b) {
                        var c = m[b];
                        if (c) {
                            return c;
                        }
                        c = b.charCodeAt();
                        return '\\u00' +
                            Math.floor(c / 16).toString(16) +
                            (c % 16).toString(16);
                    });
                }
                return '"' + x + '"';
            },
            array: function (x) {
                var a = ['['], b, f, i, l = x.length, v;
                for (i = 0; i < l; i += 1) {
                    v = x[i];
                    f = s[typeof v];
                    if (f) {
                        v = f(v);
                        if (typeof v == 'string') {
                            if (b) {
                                a[a.length] = ',';
                            }
                            a[a.length] = v;
                            b = true;
                        }
                    }
                }
                a[a.length] = ']';
                return a.join('');
            },
            object: function (x) {
                if (x) {
                    if (x instanceof Array) {
                        return s.array(x);
                    }
                    var a = ['{'], b, f, i, v;
                    for (i in x) {
                        v = x[i];
                        f = s[typeof v];
                        if (f) {
                            v = f(v);
                            if (typeof v == 'string') {
                                if (b) {
                                    a[a.length] = ',';
                                }
                                a.push(s.string(i), ':', v);
                                b = true;
                            }
                        }
                    }
                    a[a.length] = '}';
                    return a.join('');
                }
                return 'null';
            }
        };

    // Export our serialize function outside of this anonymous function
    JSON.serialize = function(o) { return s.object(o); };
})(); // Invoke the anonymous function once to define JSON.serialize()

A.2. Client-side MQL Queries through a Proxy

Chapter 4 includes a JavaScript Metaweb.read() utility function whose script-based
implementation is shown in Example 4.8. In this section we present a proxy-based implementation of the
same function.

The proxy-based implementation of Example A.2 behaves identically to the script-based
implementation in Example 4.8. We could convert the test application of Example 4.7 to use this new
implementation simply by changing this line:

<script src="metaweb.js"></script> <!-- defines Metaweb.read() -->

to this:

<script src="metaweb_proxy.js"></script> <!-- defines Metaweb.read() -->

This implementation of Metaweb.read() uses an XMLHttpRequest object to submit the HTTP
request to the proxy service. (XMLHttpRequest allows HTTP requests to be scripted with JavaScript.
The API for this object is not described here: it is a common feature of all Ajax-based web
application frameworks and should be documented in any modern JavaScript reference.) It uses the
JSON.serialize() function to convert the query object to a string, and uses JSON.parse() to
convert the response text back into object form. The implementation of these JSON functions is
in Example A.1.
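
For readers following the Python examples in this manual, the same wrap, serialize, and encode steps can be sketched in Python. The function name proxy_url is hypothetical, the standard json module stands in for simplejson, and the envelope layout and the queries parameter are taken from Example A.2. The import is written so that the sketch runs on both Python 2 and Python 3.

```python
import json

try:
    from urllib.parse import quote, unquote    # Python 3
except ImportError:
    from urllib import quote, unquote          # Python 2


def proxy_url(query, proxy="mqlread.php", name="qname"):
    """Build the URL that submits one query to the proxy."""
    # Put the query in inner and outer envelopes
    envelope = {name: {"query": query}}
    # Serialize to JSON, then URL-encode every reserved character
    return proxy + "?queries=" + quote(json.dumps(envelope), safe="")


query = [{"type": "/music/artist", "name": None}]
url = proxy_url(query)

# Decoding the parameter again recovers the original envelope
decoded = json.loads(unquote(url.split("=", 1)[1]))
```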

Example A.2. metaweb_proxy.js: Metaweb queries through a proxy

/**
 * metaweb_proxy.js:
 *
 * This file implements a Metaweb.read() utility function using XMLHttpRequest
 * and a server-side proxy script named mqlread.php
 * For simplicity, this code requires a native XMLHttpRequest object,
 * which means that it does not work in Internet Explorer prior to IE7.
 **/
var Metaweb = {};                      // Define our namespace
Metaweb.QUERY_PROXY = "mqlread.php";   // The relative URL of the proxy service

// Send query q to Metaweb, and pass the result to function f.
// This function always wraps inner and outer envelopes around the query.
Metaweb.read = function(q, f) {
    // Put the query in inner and outer envelopes
    var envelope = {qname: {query: q}};
    // Serialize the envelope to a JSON string
    var serialized = JSON.serialize(envelope);
    // URL encode the serialized query
    var encoded = encodeURIComponent(serialized);
    // Build the query URL
    var url = Metaweb.QUERY_PROXY + "?queries=" + encoded;

    // Use XMLHttpRequest to submit the request to the proxy
    var request = new XMLHttpRequest();
    // When the response arrives, call this function
    request.onreadystatechange = function() {
        // If the request is done and was successful
        if (request.readyState == 4 && request.status == 200) {
            // Parse the JSON text of response to an envelope object
            var outerEnvelope = JSON.parse(request.responseText);
            // Get inner envelope from outer envelope
            var innerEnvelope = outerEnvelope.qname;
            // Make sure the query was successful
            if (innerEnvelope.status == "/mql/status/ok") {
                // Take the result object out of the response envelope
                var result = innerEnvelope.result;
                // And pass that object to the user's function
                f(result);
            }
        }
    };
    // Now send the request to the proxy
    request.open("GET", url);
    request.send(null);
};

This Metaweb.read() implementation is not complete without the mqlread.php proxy script that
it relies on. Example A.3 shows a very simple implementation of such a proxy: it is a PHP script
that simply forwards the URL parameter queries to the mqlread service at Metaweb and relies on
the default behavior of the PHP curl_exec() function, which directs the response from the
forwarded query to the output stream of the script. Notice that this trivially simple script is not a
fully adequate proxy. For a production web application, you would want to use a fully
fleshed-out proxy.



Example A.3. mqlread.php: a trivial mqlread proxy in PHP

<?php
$q = str_replace("\\\"", "\"", $_GET["queries"]);
$url = "http://www.freebase.com/api/service/mqlread?queries=" . urlencode($q);
$request = curl_init($url);
curl_setopt($request, CURLOPT_COOKIE, "###freebase.com cookie data here###");
curl_exec($request);
curl_close($request);
?>

A.3. Example: Auto-completion with mqlread


This section presents a JavaScript example that demonstrates how to use mqlread to perform
auto-completion and validation of form input. For the sake of code brevity, the auto-completion
demonstrated here is a fairly simple kind: if there is a single possible completion for the user's
input, the code below will complete that input. It is also possible, though not demonstrated here,
to use a dropdown menu to list all possible completions of the user's input. (See, for example,
http://developer.yahoo.com/yui/autocomplete/ for a JavaScript UI that allows the user to choose
among multiple completions). Metaweb servers also support a specialized (and optimized) auto-
completion service, which is used by the freebase.com client but is not documented here.

The auto-completion and validation code is in a function called
Metaweb.addValidationAndCompletion(). Before showing the implementation of this function,
however, we'll show how it is used. Example A.4 is a simple HTML page that displays two HTML
input fields. One field expects the name of a country, and the other the name of a rock band. The
page includes a script that calls Metaweb.addValidationAndCompletion() to set up appropriate
event handlers for those fields. The page also includes a CSS stylesheet that defines styles for
error messages and background colors for input fields that contain invalid or incomplete input.

Example A.4. completiontest.html: using Metaweb.addValidationAndCompletion()

<script src="json.js"></script>
<script src="metaweb.js"></script>
<script src="validateAndComplete.js"></script>
<script>
window.onload = function() {
    // When the document loads, set up autocompletion for our text fields
    Metaweb.addValidationAndCompletion(document.getElementById("country"),
                                       "/location/country",
                                       document.getElementById("countrymsg"));
    // An additional constraint can be passed as an optional fourth
    // argument; it is commented out in this example.
    Metaweb.addValidationAndCompletion(document.getElementById("band"),
                                       "/music/artist",
                                       document.getElementById("bandmessage")
                                       /*{genre:"Rock"}*/);
};
</script>
<style>
/* Styles for incomplete and invalid input */
input.invalid { background-color: #f88; }    /* invalid input is red */
input.incomplete { background-color: #ff8; } /* incomplete input is yellow */
span.message { font-style: italic; }
</style>
<body>
<!-- These are the input fields that do validation and completion -->
Enter a country name (or unique prefix):
<input id="country"><span id="countrymsg" class="message"></span>
<br>
Enter the name of a rock band (or unique prefix):
<input id="band"><span id="bandmessage" class="message"></span>
</body>
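
As a rough sketch of what the optional fourth argument does, the snippet below builds the same kind of prefix query that Metaweb.addValidationAndCompletion() submits and copies the extra constraint properties into it. The helper name buildPrefixQuery is hypothetical and is not part of metaweb.js; the genre value is simply the one from the commented-out argument in the example above.

```javascript
// Hypothetical helper (not part of metaweb.js): build a prefix-match
// MQL query and merge in any additional constraint properties.
function buildPrefixQuery(type, input, constraints) {
    var query = [{
        type: type,                                 // Find objects of this type
        name: null,                                 // Ask for the object name
        "name~=": "^" + input.toLowerCase() + "*",  // Prefix match on the name
        sort: "name",                               // Sort results by name
        limit: 2                                    // Two results are enough
    }];
    if (constraints) {
        for (var c in constraints) {
            // Copy only the object's own properties into the query
            if (Object.prototype.hasOwnProperty.call(constraints, c)) {
                query[0][c] = constraints[c];
            }
        }
    }
    return query;
}

// The query that would be built for the band field if the
// {genre:"Rock"} constraint were enabled.
var bandQuery = buildPrefixQuery("/music/artist", "Led", {genre: "Rock"});
```

With the constraint enabled, the query object carries a genre property alongside the type and name properties, so only matching artists in that genre would be returned.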

Example A.5 is the implementation of Metaweb.addValidationAndCompletion(). It begins with
a long comment that explains how it works and how it is used. Note that the MQL query used in
this code uses the ~= pattern-matching operator, with ^ and * to find objects in the graph whose
name begins with the user's input. If this query returns no results, then the user's input is invalid.
If the query returns exactly one result, then it can be used for auto-completion. And if the query
returns more than one result (none of which match the user's input exactly) then the input cannot
be completed. If the user's input is invalid or incomplete, the function sets the CSS class of the
text field to give the user feedback, and also optionally displays an error message.
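
The three outcomes described above can be sketched as a small standalone function. This is only an illustration of the decision rule: the name classifyInput and its return labels are hypothetical, and the sample results are hand-written rather than actual query output.

```javascript
// Hypothetical helper: classify the (at most two) name-sorted results
// of the prefix query against the user's input.
function classifyInput(input, results) {
    var lower = input.toLowerCase();
    // No results: nothing of this type begins with the input
    if (results.length === 0) return "invalid";
    // One result, or an exact match on the first result: the input
    // can be completed (or is already complete)
    if (results.length === 1 ||
        results[0].name.toLowerCase() === lower) return "valid";
    // Several candidates and no exact match
    return "incomplete";
}

// Hand-written sample data for illustration only
var none = classifyInput("zzz", []);
var unique = classifyInput("fran", [{name: "France"}]);
var exact = classifyInput("genesis",
                          [{name: "Genesis"}, {name: "Genesis P-Orridge"}]);
var ambiguous = classifyInput("united",
                              [{name: "United Kingdom"}, {name: "United States"}]);
```

Keeping the rule in a function of its own, separate from the event handler, makes it possible to check the rule without a live connection to a Metaweb server.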

Example A.5. validateAndComplete.js: form validation and completion with mqlread

/**
 * Add an onchange event handler to the specified textfield object
 * to validate the user's input and autocomplete it if necessary.
 * The type argument is the Metaweb type, such as "/music/artist",
 * for which autocompletion should be done.
 *
 * The optional message argument specifies a document element in
 * which error messages (such as "invalid input" or "incomplete input")
 * should be displayed. The optional constraints argument is an object
 * that contains additional MQL properties that should be added to the
 * query. This can be used to further constrain the autocompletion
 * beyond simple type-based autocompletion.
 *
 * If the callback handler determines that the user's input is invalid or
 * incomplete, it sets the className property of the text field to
 * "invalid" or "incomplete" (overwriting any other class values
 * specified by that property).
 * You can define these CSS classes to set background colors or otherwise
 * highlight fields that require the user's attention.
 *
 * Validation and autocompletion are done using the ~= pattern-matching
 * operator to ask for results of the specified type that begin with
 * the specified string. The query requests that results be sorted by
 * name and that only the first two be returned. If no results are
 * returned, this means that the user's input is invalid. If exactly
 * one result is returned, then the user's input is unique and
 * autocompletion is performed. If two results are returned, then the
 * user's input is not unique, but may still be valid, if the input
 * matches the first result (because the results are sorted), and that
 * result is a prefix of the second. If two results are returned and
 * the first result is not the same as the user's input, then the
 * input is incomplete.
 *
 * This function sets the onchange and class attributes of the text
 * field element and should not be used with HTML elements that set
 * these attributes themselves. Also, this function alters the
 * visibility and content of the optional message element.
 */
Metaweb.addValidationAndCompletion = function(textfield, type,
                                              message, constraints)
{
    // Ensure that the message element, if any, is hidden
    if (message) message.style.visibility = "hidden";

    // And add an event handler to the text field.
    textfield.onchange = function() {
        // Get the user's input and convert to lowercase.
        // Metaweb does case-insensitive pattern matching.
        var input = this.value.toLowerCase();

        // This is the MQL query we use for autocompletion
        var query = [{
            type: type,                   // Find objects of this type
            name: null,                   // We want to know object name
            "name~=": "^" + input + "*",  // ^ and * make this a prefix match
            sort: "name",                 // Sort by name: an exact match comes first
            limit: 2                      // We only care about the first 2
        }];

        // Add any additional constraints to the query.
        // The loop variable is declared with var so that it does not
        // become an implicit global.
        if (constraints) {
            for (var c in constraints) query[0][c] = constraints[c];
        }

        // Now submit the query and pass the results to the nested function
        Metaweb.read(query, function(results) {
            // If there are no results, input is invalid
            if (results.length == 0) {
                // Set invalid class and display invalid message
                textfield.className = "invalid";
                if (message) {
                    message.innerHTML = "invalid input";
                    message.style.visibility = "visible";
                }
            }
            // If there is one result or if the first result
            // matches exactly, input is valid.
            else if (results.length == 1 ||
                     results[0].name.toLowerCase() == input) {
                // Autocomplete the value.
                // Use capitalization Metaweb returns to us
                textfield.value = results[0].name;
                // Clear class and message
                textfield.className = "";
                if (message) {
                    message.innerHTML = "";
                    message.style.visibility = "hidden";
                }
            }
            // If there is more than one result and no match
            // then the user's input is incomplete
            else {
                // Set incomplete class and message
                textfield.className = "incomplete";
                if (message) {
                    message.innerHTML = "incomplete input";
                    message.style.visibility = "visible";
                }
            }
        });
    };
};
