REST at Amazon v1

Amazon.com REST design/development practices


Resource-Driven Distributed Computing Interfaces for Amazon v1


eCommerce Platform Recommendation 26915 th FebruaryAprilMay, 2009

WORKING GROUP ADVISOR:
Chris Suver
EDITOR:
Brad Porter
WORKING GROUP MEMBERS:
Geoff Arnold, Josh Curry, Barry Feigenbaum, Joe Ellsworth, Jeromey Goetz, Mark Hjelm, Jamie Hunter, Jesper
Johansson, Don Johnson, Brian O’Neill, Madhu Parthasarathy, Korwin Smith, Alex Yiu

1 Abstract

Resource-oriented distributed system interactions offer improvements over traditional RPC-based service
interactions. They do this by focusing on the data artifacts in the system, or “resources”, rather than the
functional behaviors. Open-content systems presume that all data in those resources is significant to someone
and therefore must be preserved even when only a subset of the content is necessary for the current interaction.
By defining a common set of operations over resources, the functional semantics of the interfaces can be fixed.
By combining those common operations with an open-content-based resource model, the semantic expressiveness
of the interfaces can be extended as necessary. This applies to traditional data-oriented resources and also to
workflow operations.

Hypertext Transfer Protocol (HTTP), used properly, offers a convenient and widely adopted application protocol
for implementing RESTful interactions. This document defines standard conventions for proper use of HTTP and
combines them with a standard extensible datagram format (ION) and an interface definition language (SDL) to
specify a complete model for open-content, resource-oriented service definition. Interfaces defined with this
framework can be automatically externalized as XML/XSD over HTTP, or JSON over HTTP.

2 Contents
1 Abstract
3 What do you mean by REST?
3.1 Why HTTP?
3.2 General Principles
4 HTTP Client assumptions
4.1 HTTP semantics
4.2 Methods
4.3 Redirects
4.4 Connections
4.5 Retry
4.6 Security
4.7 Encodings

Copyright © 2008 Amazon Global Services. All rights reserved. |1


4.8 Mapping
5 Resources
5.1 Kinds of Resources
5.2 References as URLs
5.3 Resource Naming
5.4 Organizing your service
5.5 Versioning
5.6 Concurrent Writes (PUTs) & Resource Versioning
5.7 Keys Vs URLs
6 Using HTTP Methods
6.1 Definition of Standard HTTP Methods
6.2 Applying HTTP Verbs to Entity Resources
6.3 Applying HTTP Verbs to an Entity Factory/Recycler
6.4 Applying HTTP Verbs to Collection Resources
6.5 Applying HTTP Verbs to Algorithmic Resources
7 Protocol Semantics
7.1 Caching
7.2 Content Negotiation
7.3 Authorize Header
7.4 HTTP Redirect
7.5 Using SSL
7.6 Cookies
8 Working with Structured Resources
8.1 Data Formats
8.2 Why Ion/JSON (and not XML)?
8.3 Type Representation and Negotiation
8.4 Interface Specification
8.5 Query semantics
8.6 QUERY
9 Multiple Views of Same Resource
9.1 Definitions
9.2 Observations
9.3 Amazon REST Approach
10 Extended Use-Cases
10.1 Batching
10.2 Partial Resource Updates
10.3 “RPC Operation” Using a REST Interface
10.4 Events / Notifications via HTTP
10.5 Workflows
10.6 Workflows
10.7 Extending HTTP Methods
11 APPENDIX A: Glossary
11.1 Normative References:
11.2 Non-Normative References:
12 APPENDIX: Old Notes
12.1 the resource URL
12.2 Sample interactions from our white board discussions:
12.3 Questions:
12.4 Misc Notes:
12.5 user resources:

3 What do you mean by Resource-Oriented?

• resource oriented, not RPC oriented
  o because data lives longer than code.
  o because you don't want the code in your service bound to code in your clients
• CRUD first, methods second, objects only in your code

The key principle behind resource-oriented interfaces is to consider the data artifacts, or resources, that
best represent the service-to-service state transition semantics.

In traditional RPC-based interface design, interfaces model the functional behaviors the service is trying to enable.
When one service invokes another service, it does this by invoking a command and a set of parameters. For
instance, a service which processes orders might have an interface that says “sendOrderForProcessing” with an
order-id as a parameter.

In resource-oriented interface design, interfaces are modeled on the resources needed to define the
semantics of those behaviors. Instead of one service expressing a functional directive to another service with a
command and a set of parameters, resource-oriented interfaces focus on transferring state that represents the
change in the state of the world that one service wishes to communicate to another (this representational state
transfer is also known as REST). For the same example above, instead of expressing the command
“sendOrderForProcessing”, a resource-oriented interface would allow you to express the desired change in state
either by updating a resource representing the state of the order, or by adding the order resource to a queue
resource representing the orders that need to be processed.
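
As a rough illustration of the difference, the sketch below builds (but does not send) the two styles of request for the order example. The host name, paths, and JSON field names are hypothetical and are not defined by this document.

```python
import json
from urllib.request import Request

BASE = "http://orders.example.com"  # hypothetical service host


def rpc_style(order_id: str) -> Request:
    """RPC style: name a command and pass parameters to it."""
    body = json.dumps({"orderId": order_id}).encode("utf-8")
    return Request(f"{BASE}/sendOrderForProcessing", data=body, method="POST")


def update_order_state(order_id: str) -> Request:
    """Resource style, option 1: transfer the new state of the order resource."""
    state = {"id": order_id, "status": "ready-for-processing"}  # assumed fields
    return Request(f"{BASE}/orders/{order_id}",
                   data=json.dumps(state).encode("utf-8"), method="PUT")


def enqueue_order(order_id: str) -> Request:
    """Resource style, option 2: add the order to a queue resource."""
    entry = {"order": f"{BASE}/orders/{order_id}"}
    return Request(f"{BASE}/queues/orders-to-process",
                   data=json.dumps(entry).encode("utf-8"), method="POST")
```

In both resource-style variants the URL names a thing (an order, a queue) and the body carries state, whereas the RPC variant puts the verb in the URL.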

While this document focuses primarily on resource-oriented interactions, care and attention have been paid to
how resource-oriented service interactions can live alongside traditional RPC-style interfaces. The core idea is
to expose your service’s functionality as a set of resources. These are self-contained data entities,
representations of which the service and its users exchange. Insofar as the service allows, these entities are
manipulated with the standard CRUD operations: create, read, update and delete. When using HTTP as the
communication protocol these operations are mapped onto the standard HTTP methods: PUT, GET, POST and DELETE.

One key departure from some REST definitions is that this incarnation is not a “pure REST” approach. The bulk of
your service should be exposed as resources manipulated using the standard HTTP methods. In most cases
the entire service functionality will be exposed as resources. However, there are times when it is necessary to
expose functionality in a way that is really a procedure call, because recasting the procedure invocation as a
resource change would only obscure it. To accommodate these situations, a mechanism for remote procedure calls
is also defined by this specification.

3.1 Why HTTP?


As a practical matter HTTP is the most direct means of implementing resource-driven distributed
computing interactions. The methods and the basic semantics of the HTTP application protocol are based on the
semantics of resource-driven representational state transfer. HTTP is a widely deployed, well-tested, flexible,
and well-supported application protocol. Further, the notions of URLs, HTTP headers, GET/POST,
HTTPS and caching constructs are widely understood by a large class of developers.

While it is possible and indeed common to express RPC-style interactions over HTTP (SOAP, XML-RPC, AWS Query)
by effectively treating HTTP purely as a transport protocol, doing so often requires abusing the application-layer
protocol features in HTTP such as idempotency, cacheability, resource querying, and resource naming.

Because the application-layer semantics of HTTP have been overloaded to support use-cases outside of
resource-oriented interactions, we have found it necessary to revisit the HTTP specification and provide clarifying
guidance around how to use HTTP to achieve robust resource-oriented interactions. (Note: just because a service
exposes its functionality through HTTP does not mean it has a REST interface or is even particularly resource
oriented.) In addition, the resources in a SOA environment are data. As such, the content needs to support data
serialization well. For data serialization, JSON or Ion is the preferred format: JSON when externally facing and
Ion internally.

• HTTP as an application protocol is well tested, reasonably performant (when used appropriately), flexible,
and well supported. (Note: HTTP is *not* a transport; if you're looking for a transport, check out TCP or SMTP.)
• HTTP’s semantics align well with the tenets of both supportable distributed systems and resource-oriented
services.

3.2 General Guidelines when Designing Resource-Oriented Interfaces
Resources in the context of the interface are self-contained data values. Resources are named, and
usually contain primary keys that uniquely identify them. Resources are the units of new and delete (in C++
terms). The boundary of a resource is the boundary of the effect of the methods, with DELETE being a particularly
useful method for understanding where that boundary lies.

Resources may be real or virtual. Simple CRUD services, like a simple rolodex service, may only need real
resources. Many services, like the Amazon Item Pipeline, expose virtual resources that map to underlying
resources indirectly.

While HTTP supports persistent connections, this is strictly a performance optimization. HTTP is a stateless
protocol. That is, neither the server nor the client requires context beyond the single request for requests to be
valid and processed. This is an important characteristic that facilitates scaling and failover, and it should be
maintained.

A commonly encountered example of state that has created problems for distributed systems is locking. Any form
of locking that spans requests, such as transactions, opens up this issue. The challenge is deciding when such a
lock can be removed and what state the client might be left in if it is. Techniques such as eventual consistency,
optimistic concurrency, or workflows can be used to avoid this pitfall.

Use them.
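
As one illustration of optimistic concurrency, the sketch below uses an in-memory store in which every write must name the version the writer last read. The class and method names are hypothetical; in an HTTP service the version would typically travel in a header, which this sketch does not model.

```python
class VersionConflict(Exception):
    """Raised when a writer's copy of the resource is stale."""


class ResourceStore:
    """In-memory stand-in for a service's resource storage (illustrative only)."""

    def __init__(self):
        self._data = {}  # name -> (version, value)

    def get(self, name):
        """Return (version, value); version 0 means the resource does not exist yet."""
        return self._data.get(name, (0, None))

    def put(self, name, value, expected_version):
        """Write only if nobody else has written since the caller's read."""
        current_version, _ = self.get(name)
        if current_version != expected_version:
            raise VersionConflict(
                f"{name}: expected v{expected_version}, found v{current_version}")
        self._data[name] = (current_version + 1, value)
        return current_version + 1


store = ResourceStore()
v0, _ = store.get("orders/42")
store.put("orders/42", {"status": "new"}, expected_version=v0)  # succeeds, now v1
try:
    store.put("orders/42", {"status": "stale"}, expected_version=v0)  # stale writer
    conflict = False
except VersionConflict:
    conflict = True  # the caller re-reads and retries; no lock spans requests
```

No request ever holds a lock here, so there is nothing to clean up if a client disappears between its read and its write.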

Try to form the service as resource updates. Many operations can be framed as "store user input". The store then
triggers other work.

The user can come back for status on how the operation is proceeding including errors as appropriate.
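
The "store user input, then check status" pattern can be sketched as follows. The path layout, status values, and validation rule are assumptions made for illustration, and the triggered work is run synchronously here only to keep the example self-contained.

```python
import uuid

submissions = {}  # submission id -> {"input", "status", "error"}


def store_user_input(payload):
    """Store the input and return at once with the name of a status resource."""
    sid = uuid.uuid4().hex
    submissions[sid] = {"input": payload, "status": "pending", "error": None}
    return f"/submissions/{sid}"


def run_pending_work():
    """Work triggered by the store; a real service would do this asynchronously."""
    for record in submissions.values():
        if record["status"] != "pending":
            continue
        if record["input"].get("quantity", 0) > 0:  # hypothetical validation rule
            record["status"] = "done"
        else:
            record["status"] = "failed"
            record["error"] = "quantity must be positive"


def get_status(path):
    """What the user sees when coming back to the status resource."""
    record = submissions[path.rsplit("/", 1)[-1]]
    return {"status": record["status"], "error": record["error"]}
```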

However, not all operations make sense as a resource. One simple example that comes to mind is "shutdown"
(turn off the server or service). While this could be implemented as some resource or resource property which
could be changed to trigger the shutdown, it is really best suited to be an RPC. So use an RPC when a resource
appears to be highly contorted. But beware: this is the first step on a slippery slope. If you find that most, or
even many, of your operations are being expressed as RPCs, it is most likely you have a problem.

The operations using the methods GET, PUT and DELETE are defined in HTTP to be idempotent, i.e. repeatable
without problems and with the same final result as if the operation had been done only once. (Idempotent does
not mean free from side effects: PUT and DELETE do change state on the server.) Idempotence needs to be
preserved. Again this facilitates distributed applications, especially retry, since in many situations (especially
in the face of network errors) the client is unable to tell whether an operation it started has completed.

Idempotent operations can simply be repeated in this case without problem.
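
A minimal sketch of why a retried PUT or DELETE is harmless while a retried append is not. The in-memory `store` and the function names are illustrative assumptions, not part of any client library.

```python
store = {}


def put(name, value):
    """Replace the named resource with the given state."""
    store[name] = value


def delete(name):
    """Remove the named resource; removing a missing resource changes nothing."""
    store.pop(name, None)


def post_append(name, item):
    """Append to a collection; NOT idempotent, each repeat adds another item."""
    store.setdefault(name, []).append(item)


put("orders/42", {"status": "shipped"})
put("orders/42", {"status": "shipped"})   # retry after a timeout
state_after_puts = dict(store)

delete("orders/42")
delete("orders/42")                       # retry
state_after_deletes = dict(store)

post_append("queue", "orders/43")
post_append("queue", "orders/43")         # the retry duplicates the entry
```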

4 HTTP Client assumptions

4.1 HTTP semantics

Clients must respect the HTTP semantics. This includes policies around caching and TTLs, and the idempotence
of requests where appropriate (e.g. GET). It should also include use of the currently defined HTTP headers
whenever they provide the functionality the service or client needs. Identity, caching, and encoding are
all examples where the functionality should be handled through existing HTTP headers.

4.2 Methods
The client code should be able to use the standard HTTP methods - including GET, PUT, DELETE, POST, and HEAD.
It should also be able to pass through extended methods (with the details of how covered elsewhere).

4.3 Redirects
The client library (in whatever form is appropriate to the hosting language) should accept and handle redirects
from the service being contacted. We will likely want policies around whether further requests should "stick" to
the new base URL or revert back to the original base (there are use cases for both). HTTP supports both
alternatives.

4.4 Connections
We should have support for persistent connections as the normal interaction is one where multiple requests will
typically be used between the client and the service.

4.5 Retry
Clients are expected to retry appropriately on failed (timed out) requests. They are also expected to be well
behaved when retrying. "Well behaved" includes appropriate backoff schemes and termination of retry after a
reasonable period including support for the “Retry-After” header.
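
One possible shape for "well behaved" retry is sketched below: exponential backoff with jitter, a cap on attempts, and deference to a “Retry-After” value. The constants are assumptions, and only the delta-seconds form of the header is handled; the HTTP-date form is ignored in this sketch.

```python
import random

MAX_ATTEMPTS = 5    # assumed limit on retries
BASE_DELAY = 0.5    # seconds; assumed
MAX_DELAY = 30.0    # seconds; assumed cap on the backoff


def next_delay(attempt, retry_after=None, rng=random.random):
    """Seconds to wait before retry number `attempt` (0-based), or None to give up."""
    if attempt >= MAX_ATTEMPTS:
        return None
    backoff = min(MAX_DELAY, BASE_DELAY * 2 ** attempt)
    delay = backoff * (0.5 + rng() / 2)  # jitter within [50%, 100%) of the backoff
    if retry_after is not None and retry_after.strip().isdigit():
        # Never retry sooner than the server asked.
        delay = max(delay, float(retry_after))
    return delay
```

A caller would sleep for the returned delay between attempts and stop retrying once `None` comes back.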

4.6 Security
The HTTP client should handle some aspects of security. This includes support for HTTPS for use in some
circumstances. The client library as a whole should support validating security certificates in accordance with
Amazon policies. It should support signing and/or encrypting all or appropriate parts of requests, and it should
support verifying the signatures on, and/or decrypting, the signed or encrypted portions of the response.

4.7 Encodings
All of our clients should use UTF-8 as the preferred character encoding. Exceptions to this should be limited to
services or clients that have to work with external services (or clients) where UTF-8 encoding is not available.
(For non-textual data, such as PNG resources, UTF-8 clearly isn't meaningful.) The client should use the content
type header to request the serialization in our internal formats, such as Ion binary or text, BSF datagram, JSON,
etc. In general XML is not appropriate internally as a data serialization format due to its size or decoding
complexity (if the data itself is XML then that's another matter, for example the Merchant data in single feed
format). When communicating with external parties XML certainly needs to be supported.

4.8 Mapping
The client code should be organized such that users can operate at the appropriate level of the stack. HTTP, REST
or an even higher level abstraction is one dimension. The ability to access the raw request buffers, or the
serializer form, or a higher level object is another dimension. While the higher level abstractions should be
complete, it is likely there will be times when particular users need to break the abstraction. If that is
necessary we will need to allow it, but we also need to be able to know that this is what is taking place. The
higher level abstraction, such as the object mapping layer, should be able to adjust to changes on the wire
without the need to recompile the clients. (There are a number of ways to achieve this, such as metadata-driven
runtime mappings.)

5 Resources

5.1 Kinds of Resources


Roy Fielding’s thesis on representational state transfer defines a resource as “anything that can be named”. The
URI specification defines a resource as anything that "might be identified by a URI", which is therefore anything at
all that can be named. The HTTP specification defines a resource as "a network data object or service that can be
identified by a URI". In almost all cases, a resource name should be a noun (as opposed to a remote procedure call
operation, which should be a verb).

Resources are data that a service makes available to its users. The data may be encoded in any of a number of
data formats, but in general the service data in Amazon services will be encoded in a way that the caller can
understand and operate on (as distinct from an undifferentiated blob). Biblio records, Tibco datagrams, XML,
JSON, Stumpy and Ion are all encoding techniques we use today.

These definitions are very generic and open-ended. To better describe common HTTP usage patterns, this
document classifies resources into the following categories:

Resource Kind              Description
Entity                     A (typically indivisible) document or other structured state
Workflow Instance          A resource that represents an autonomous process
Entity Factory/Recycler    A manager of resources
Workflow Engine            A special manager for resources that represent autonomous processes
Collection                 A logical group of resources
Algorithmic                A resource that performs some (stateless) processing

[mwh: these are somewhat different than in Alex's glossary.]

5.2 References as URLs
While a resource is data, often a resource contains data that serves as a key to access other resources. From an
information-theoretic point of view a raw key value, such as an integer customer id, has essentially the same
semantic value as a URL used to fetch the customer using the id, but the URL form is much easier to use in
practice. As a result, when you are including references to entities that are themselves resources, it is
recommended that you use the URL form when that is practical. And when the "raw" value is required it may even
be useful to include a redundant copy in the form of a URL.
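
A small sketch of a representation that carries both the raw key and its URL form. The base URL, path layout, and field names are hypothetical.

```python
from urllib.parse import urljoin

BASE = "http://example.com/"  # hypothetical service root


def order_representation(order_id, customer_id):
    """Build an order whose customer reference is usable without extra knowledge."""
    return {
        "id": order_id,
        "customerId": customer_id,                              # raw key value
        "customer": urljoin(BASE, f"customers/{customer_id}"),  # redundant URL form
    }
```

A client holding this representation can fetch the customer directly from the `customer` field, without knowing how customer URLs are constructed.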

5.3 Resource Naming


I haven't had time to write my paragraph, but here are some bullet-points:

• Follow URI standard
• Hostname is insignificant
• Hierarchical in nature
• does NOT imply that the resource is a file or that the URI maps to an actual file system pathname
• Within a path segment, the characters "/", ";", "=", and "?" are reserved.
• Relative references are ok
• Subcomponents of the path do imply a hierarchy and may imply that the parent is also a resource, but
that parent may or may not be accessible
• File extensions are optional, but should not be used if content-negotiation is possible on a resource (e.g.
don't reference foo.xml if the Accept header is going to be text/json)
• URLs ending in ‘/’ should refer to a set or collection rather than be used to reference an individual entity
• Entities named with the Amazon-Common-Identifier that are accessible via REST interfaces can and
should be referenced using the full ACI such as http://example.com/amazon/amn1.asin.1.29847389
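
Two of the bullet points above, the trailing ‘/’ convention and the hierarchy implied by path subcomponents, can be sketched with the standard library. The function names are hypothetical helpers, not part of any framework described here.

```python
from urllib.parse import urlsplit


def is_collection(url):
    """Guideline: a path ending in '/' names a set or collection."""
    return urlsplit(url).path.endswith("/")


def parent(url):
    """Name of the parent implied by the path hierarchy.

    The parent is only implied; it may or may not be an accessible resource.
    """
    parts = urlsplit(url)
    trimmed = parts.path.rstrip("/")
    parent_path = trimmed.rsplit("/", 1)[0] + "/"
    return parts._replace(path=parent_path, query="", fragment="").geturl()
```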

5.4 Organizing your service


Your service is accessed first through DNS, the DNS name portion of the URL. This will be mapped to a process that
can handle all URLs associated with it. The process will typically distribute this work out in any number of ways.
A common form of this is using a load balancer to distribute requests across a homogeneous fleet for processing.
Partitioning requests based on the resource key or resource name is another common pattern.
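
Partitioning by resource name can be as simple as hashing the name onto a list of partitions, as in the sketch below. The host names are hypothetical, and plain modulo hashing is an assumption chosen for brevity: it remaps most names whenever the fleet size changes.

```python
import hashlib


def partition_for(resource_name, partitions):
    """Pick a partition deterministically from the resource name."""
    digest = hashlib.sha256(resource_name.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(partitions)
    return partitions[index]


fleet = ["host-a", "host-b", "host-c"]  # hypothetical partition owners
owner = partition_for("orders/42", fleet)
```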

In addition to the resources (both physical and logical) that your resources owns your service should expose some Commented [ay13]: “physical” and “virtual” terms were used
system resources. few sections ago.
Commented [ay14]: Another terminology is introduced here
Formatted: Normal

"System" Resources, the root/:
This is the unidentified resource, i.e. no path. This should return a simple human oriented page that could be used
as the starting point for ad hoc (i.e. developer) exploration of your service, offering links to documentation if that's
appropriate. [ay15: Do we want to deal with multiple services hosted at one host? If so, all paths will be prefixed
with "/serviceName".]

Or this may just fail. A developer friendly page here is recommended.

"system" resources, status/status:
This is essentially the "ping" resource. As a minimum it will show that the service is operational.

"system" resources, schema/interface:
This is the base for "type" discovery. The schema for all the entities this service supports should be accessible
through this base. Generally the name of the user resource is a key that can be used to access the schema for that
resource. [cas: we might want an additional "sub directory" here like schema/resources or service
definition/schema] The schema version should be an optional key. [mwh16: Let’s see how that works out… SDL is
oriented around a single rooted document, with the service owner controlling layout (so at the very least it should
be an SDL convention, not a resource naming / framework convention).]

/ping:
A simple liveness test that returns no content and a “200 OK” status when the service is functioning normally.

In addition there should be information about the service itself accessible through this.
[ay17: What is the difference between “/ping” and “/status”? Suggested to remove these 2 sentences.]

For Ion binary the schemata should include the symbol tables used for serializing the content to facilitate sharing
symbol tables across calls. See general principles - don't share state :) [ay18: TODO: Clean this up]

PUT of these resources may be permitted with appropriate access control. This is left as an exercise to the
implementer. [ay19: Suggest to change the tone to: Access control implementation is service implementation specific.]

It should be possible to register for changes in some fashion: through versioning of the overall service (is
that the schema's schema, or a value associated with 'status'?), through a publishing stream, or through a
poll-able interface. [ay20: TODO: clean this up]

5.5 Versioning
Versioning is a feature that exists at (at least) 3 levels. The service API itself should support versioning so that when the
semantics of the service change sufficiently to affect the users, a distinct version can be accessed. [cas: my current
take on service version and API versioning is this should be done at the service level, as opposed to the API level.]
[mwh21: Of course SDL supports a multi-level versioning schema, intending that you do it at both levels to most
accurately capture what is changing and how.]

[mparthas: But doing it at the Service level is very coarse grained, isn’t it? What if only one API changes? Would it
not be simpler for the clients if only that API has a different version and the others remain unchanged? ]
[ay22: Moved the comment/question to here.]

The definitions of the resources a service manages also change over time. This change is handled by schema
versioning. Most service changes can be framed as schema version changes. Services should support older versions using
the specific content header. In the event the content-type header is unsuitable for this granularity we should have
another header to support this. [mwh24: Specified below.] [ay23: We need to describe when one should use
Service-Level versioning and when one should use Schema-Level versioning.]

Finally, individual resource instances often require explicit versions. The instance version can be used to make
idempotent many operations that would not otherwise be idempotent. In addition, the instance version
can be used to enable optimistic concurrency - a non-locking concurrency model. Moreover, under this
model a service owner has the option of keeping a sequence of immutable instances to show how a resource has
changed over its lifetime. A tombstone can then be used to indicate that a particular resource has been deleted.

Versions should be monotonically increasing (i.e. always getting bigger). Schema versioning is defined as part of
the SDL work. Instance versioning is well handled by a reasonably large integer, which these days would be 32 or
64 bit.
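
The instance-versioning ideas above (idempotent replays, optimistic concurrency, immutable history, tombstones) can be sketched with a small in-memory store. This is only an illustration under stated assumptions: the class and method names are hypothetical, versions start at 1, and a tombstone is represented as a version whose body is None.

```python
class VersionConflict(Exception):
    """Raised when a write does not carry the expected next version."""


class VersionedStore:
    """Hypothetical in-memory store keeping every version of each resource."""

    def __init__(self):
        # key -> list of (version, body) pairs, oldest first; body None is a tombstone
        self._history = {}

    def history(self, key):
        return list(self._history.get(key, []))

    def latest(self, key):
        versions = self._history.get(key)
        return versions[-1] if versions else None

    def put(self, key, version, body):
        current = self.latest(key)
        current_version = current[0] if current else 0
        # Replaying a write that was already applied is a no-op (idempotent).
        if current and version == current_version and current[1] == body:
            return version
        # Optimistic concurrency: only the next version in sequence is accepted.
        if version != current_version + 1:
            raise VersionConflict(f"expected version {current_version + 1}, got {version}")
        self._history.setdefault(key, []).append((version, body))
        return version

    def delete(self, key, version):
        # Deletion is recorded as a tombstone version rather than erasing history.
        return self.put(key, version, None)
```

A writer that loses a race receives VersionConflict and is expected to re-read the resource and retry with the new version number; no lock is held at any point.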

Error codes and HTTP return codes

<see Mark's doc and IMS doc's - get links soon>

5.6 Concurrent Writes (PUT’s) & Resource Versioning
[ay26: Is this section intentionally empty?]

5.7 Keys Vs URLs
[mwh27: I think this whole thing needs to move upwards so that naming, URIs, URLs, ACIs, and keys are all treated
comprehensively in one spot.]

Resource entities can be identified by location or by name. Locations and names are separate concepts. A name is
location-independent. An entity with the same name might exist at multiple locations, or its location may change,
but its name is global.

Resource locations should be specified as URLs. [mwh28: We need some language about “authorities” here.]

Resource names should be specified using the Amazon Common Identifier. Resource locations may embed
resource names. For instance, http://example.amazon.com/browse-node/amzn1.bn.1.289913 is an example of a
location (URL) that embeds a name (Amazon Common Identifier).

6 Using HTTP Methods

6.1 Definition of Standard HTTP Methods


GET
GET is used to retrieve resources. The right-hand part of the URL includes the resource name and the key parts necessary to
identify the resource. The resource might be real or virtual.

GET may also be used for queries in limited cases. A common one is to partially specify the key,
returning multiple resources. Another is to add query parameters to filter the results. [cas: needs detail]

The key may include the version of the resource, if the service's resources are versioned.



Lack of a version means ... [cas: do we want this to be "all versions", or "most recent", or error? If it's not "most
recent" should we specify a system key for "most recent" such as "newest"?] [mwh29: I vote for this for the common case.]
[ay30: “most recent” or “best effort most recent”? And, mention the concept of eventual consistency?]

PUT
PUT is used to create or update resources. PUT requires that the user know the primary
key for the resource. PUT resources should, in general, include a version of the resource that can be used to
facilitate optimistic concurrency. [mwh31: Or not, if all we want is eventual consistency, latest wins.] The version
in the resource being PUT should be the new version - i.e. the old version plus one. This preserves cacheability.
[cas: this needs to be verified.] [ay32: How about the case of resources without a version ID (e.g. most recent version)]

The resource being PUT should be an exact image of the resource you would expect to GET using the same key.
This means PUT supports NEITHER partial update NOR resources lacking their PK. [ay33: Should or must? I can foresee
a case that extra data is added to the “PUT” result. (e.g. information retrieved by other business logic; to resolve
a URL, which points to another resource, without version ID to become a URL with a version ID)]

Not all resources can be PUT. PUT should generally be under tighter control than GET. For example, a read-only
query-based resource does not support the PUT method.

There is no common contract on the time lag between PUT-ting a resource and when that version of the resource
will be returned on the corresponding GET. Note that when the GET includes the new version of the resource this
can be used to trigger "extra effort" on the part of the service to find that version (with the expectation that it is
"in progress", or perhaps just on another host). This trigger however must be examined with an eye towards DoS
risk and general cost. [ay34: Clarify what “examined with an eye” means?]

The litmus test of whether something can be PUT is - will you GET what you PUT? If you can't GET what you PUT,
use POST. If GET gets you what you PUT, then PUT is the right choice.

DELETE
Mostly like PUT, but it removes the resource. Many services may wish to create a tombstone as the new copy of
the resource.

DELETE requires the full key, and the version for concurrency control when appropriate. [mwh35: See the previous “or not”]

POST
POST is used to create a new resource when the caller does not specify the name of the resource. POST is also
used for a variety of other non-CRUD purposes. POST is not defined to be idempotent. This does not mean that
POSTs are *not* idempotent, but it does mean that requests that are not idempotent must be channeled through
the POST method. This includes non standard methods that might not be idempotent. It also includes some forms
of "create". Any time the service must assign the key to a resource and, therefore, the user cannot provide the key
a priori, the resource creation must be done using POST. Partial updates are also an example of an operation that
is not idempotent, and certainly not "cacheable", so again partial updates must either be done through POST or be
an independent resource in their own right.

A common use for POST is to create a resource to which the service assigns the identity. For example, a service that
manages contacts might assign an immutable contact_id to each contact. This id is programmatically determined
by the service; the client has no way to know what the contact_id will be for the new contact. As such the client
cannot PUT the new contact, as the URL of the contact will include the id, which is not yet known. A POST to the
same URL (leaving off the id) could be defined by the service to create a new instance of the resource and make
its content be the body of the POST. This call would then return the new resource, or resource location, to the
caller.
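
The contact example can be sketched as a tiny in-memory service. Everything here is an assumption made for illustration: the class name, the sequential id scheme, the example URL, and the use of 201 with a Location header to report the new resource location.

```python
import itertools


class ContactService:
    """Hypothetical stand-in for a service that assigns contact ids itself."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")
        self._ids = itertools.count(1)
        self._contacts = {}

    def post(self, body):
        # The service, not the caller, chooses the key of the new resource.
        contact_id = str(next(self._ids))
        self._contacts[contact_id] = body
        location = f"{self.base_url}/{contact_id}"
        return 201, {"Location": location}, body

    def put(self, contact_id, body):
        # PUT is only possible once the caller knows the full key.
        self._contacts[contact_id] = body
        return 200, {}, body

    def get(self, contact_id):
        if contact_id not in self._contacts:
            return 404, {}, None
        return 200, {}, self._contacts[contact_id]


service = ContactService("https://example.com/contacts/")
status, headers, created = service.post({"name": "Ada"})
```

After the POST, the caller can use the returned location for later GET and PUT requests. Repeating the POST would create a second contact, which is why this operation cannot be expressed as a PUT.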



6.2 Applying HTTP Verbs to Entity Resources
GET
returns a representation of a resource given the resource's name.

URI full name of the entity resource

->

response code Success or failure

body representation of the entity

PUT
creates or updates an entity resource. It is used when the caller knows the full name of the resource.

URI full name of the entity resource

body representation of the entity

->

response code Success or failure

The value provided should be the full and exact representation you want returned by subsequent GETs. Some
kinds of entity resources might not support PUT.

DELETE
destroys an entity resource. It is used when the caller knows the full name of the resource.

URI full name of the entity resource

->

response code Success or failure

Note that services are permitted to effect deletion by creating a tombstone version of the specified resource.

A workflow instance is a special case of entity that represents an active autonomous process within the system.
Such workflow instances have resource names, and their current state can be obtained by a client using GET. PUT
and DELETE might also be supported by workflow instances.



6.3 Applying HTTP Verbs to an Entity Factory/Recycler
POST
creates a resource via the factory. It is used when the caller does not know the name of the resource.

URI full name of the factory resource

Body representation of the entity

->

response code Success or failure

Location header typically contains the name of the new resource

DELETE
destroys an entity resource via the recycler. The caller must know the name of the resource. [ay36: I understand the
need of doing a POST against a “factory” URL to create a new instance. However, I am not sure about the necessity of
doing a DELETE against a “recycler”, unless we are talking about deleting multiple resources here. Why not just
do a DELETE against the resource directly?]

URI full name of the recycler resource

query parameters or body name (or characteristics) of the target resource

->

response code Success or failure

GET
returns status information about the factory or recycler. Support for GET for factory/recycler resources is optional.
[ay37: Similarly, I am not sure the need of doing a GET against a “factory” or “recycler”.]

URI full name of the factory or recycler resource

->

response code Success or failure

Body representation of the state of the factory/recycler

A workflow engine resource is a special case of a factory/recycler resource. It supports the same HTTP methods,
but the resources it returns are workflow instances (see above).



6.4 Applying HTTP Verbs to Collection Resources
GET
returns the representations of a subset of the resources contained in the collection.

URI full name of the collection resource

query parameters and/or body description of which resources to return

->

response code Success or failure

Body list of 0, 1, or more resource representations

Note that a single resource might function as both a collection resource and a factory/recycler resource. In this
case, GET would typically support collection query operation.

PUT
It is typically not appropriate to PUT (completely replace) a collection resource. If your application supports it, PUT
should be used as in the entity resource case.

6.5 Applying HTTP Verbs to Algorithmic Resources


POST
is the primary HTTP method used to interact with an algorithmic resource.

URI full name of the algorithmic resource

query parameters and/or body parameters to the algorithmic resource (optional)

->

response code Success or failure

Body results of the operation (optional)

GET
may be used when the algorithmic resource is stateless, has no side-effects and returns the same resource for
any given input. This will make the results cacheable. When using GET, the HTTP request MUST NOT contain a
body. Alternatively, GET can be used to return status information about the algorithmic resource itself. In this
case, GET should be used as in the entity resource case. [ay38: This use case may make the picture a bit
too muddy if we cannot think of more justification to allow such a GET method usage.]



Other Notes
All services SHOULD support the HEAD HTTP method. HEAD behaves exactly the same as the equivalent GET,
except that it does not return a body.

Note that creating or updating a resource with PUT or POST does not always imply that a representation of that
version of the resource will be immediately available via GET. Eventual consistency and other considerations might
delay the availability of recently updated resources. Details of such behavior are service-specific. [mparthas:
Usually, a 202 Accepted response code is returned to the user (with the location url containing the link for the
client to try at a later time)] [ay39: Moved to comment section.]

The HTTP specification further describes the behavior of the primary methods: Requests for safe HTTP methods do
not change the state of resources on the server. GET and HEAD are intended to always be safe. Requests for
idempotent HTTP methods can be repeated without causing additional side-effects. GET, HEAD, PUT, and DELETE
are intended to always be idempotent. Service owners MUST conform to this behavior.

More details on these interaction patterns, including details on the returned HTTP status codes, can be found in
the document "HTTP Overview for REST-Style Interactions".

7 Protocol Semantics

7.1 Caching
HTTP 1.1’s caching constructs are powerful tools for optimizing RESTful interactions over HTTP.

Services SHOULD take full advantage of HTTP 1.1 caching primitives for all GET requests including GET-style
requests that use the query-string field parameters. [mwh40: Did I go overboard on capitalization ?] For simplicity
and consistency, services SHOULD NOT allow caching of PUT, POST, and DELETE operations, and SHOULD enforce this by
explicitly including a "Cache-Control: no-cache" in all responses to them. Further, Cache-Control SHOULD be "public".
max-age and max-stale should be avoided.
[mparthas: need to provide details on why these should be avoided]

For GET requests, services SHOULD specify the Cache-Control, Last-Modified, ETag, and Vary headers in the
response. If the item can be cached, the server SHOULD set a reasonable Expires time.

Servers MUST support conditional GETs and return 304 Not Modified as appropriate. The ETag SHOULD
be set such that it uniquely identifies a particular version of the resource for all references to that resource
independent of which individual host or service is vending that resource. If an ETag cannot be guaranteed to be
the same for all instances of the same resource, then the ETag SHOULD NOT be used. [ay41: What kind of “instances”
are we referring to here? Machine instances in a cluster environment?] The ETag
SHOULD be generated in a way that is cheap to generate and cheap to compare. For instance, at build time, a
checksum could be generated for read-only file resources and stored as the ETag value.
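
A conditional GET with a content-derived ETag can be sketched as follows. The function names are hypothetical, request headers are assumed to arrive as a plain dict keyed by their canonical names, and weak validators (the W/ prefix) are ignored to keep the example short.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive the ETag from the content so every host computes the same value."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def conditional_get(body: bytes, request_headers: dict):
    """Return (status, response_headers, response_body) for a GET of `body`."""
    etag = make_etag(body)
    response_headers = {"ETag": etag}
    # If-None-Match may carry several comma-separated tags, or "*".
    raw = request_headers.get("If-None-Match", "")
    candidates = [tag.strip() for tag in raw.split(",") if tag.strip()]
    if etag in candidates or "*" in candidates:
        # The client's copy is current: send headers only.
        return 304, response_headers, b""
    return 200, response_headers, body
```

For read-only file resources the checksum could be computed once at build time and stored, as suggested above, rather than recomputed on every request as this sketch does.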

The Last-Modified header SHOULD return the last time at which the resource was actually modified in its
source form, and not the time at which its copy was most recently written to disk. Care should be taken to
preserve mtime on UNIX-based file resources across various deployment methods.

The Vary header MUST be used to specify which header fields can invalidate a cached object when they
change. For instance, the Amazon Global Action Trace MAY differ from request to request, but SHOULD NOT
invalidate the cached object. However, a different Authorize header may mean that a cached
object is no longer valid. The simple rule is "if the cache-key for an HTTP cache should include a header in addition
to the URL, put that header name in the Vary field."
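
The "simple rule" can be illustrated by building a cache key from the URL plus only those request headers named in Vary. The helper name and the trace header name x-example-trace are made up for this sketch.

```python
def cache_key(url: str, request_headers: dict, vary: str) -> tuple:
    """Build a cache key from the URL and the request headers listed in Vary."""
    # Header names are case-insensitive, so normalize before comparing.
    names = sorted(name.strip().lower() for name in vary.split(",") if name.strip())
    lowered = {name.lower(): value for name, value in request_headers.items()}
    return (url,) + tuple((name, lowered.get(name, "")) for name in names)


first = cache_key(
    "https://example.com/items/1",
    {"Accept": "application/json", "x-example-trace": "trace-1"},
    "Accept",
)
second = cache_key(
    "https://example.com/items/1",
    {"Accept": "application/json", "x-example-trace": "trace-2"},
    "Accept",
)
```

Here first and second are equal, because the trace header is not listed in Vary and so does not split the cache. Changing the Accept value, or listing the trace header in Vary, would produce distinct keys.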

Services MUST NOT assume that client caches can be invalidated. The only guaranteed way to force a
refresh of an item with an Expires header set is to update any references to that content to point to a new URL for
that resource. Because of this, unless it is possible to update all references, Expires headers SHOULD
NOT exceed 5 minutes in the future before revalidation is required. [ay42: Why “5 mins” particularly?]

Clients SHOULD NOT override server cache-control headers. If a client does choose to override the
caching directives, the client MUST make proper use of the Client-Controlled-Behavior specification in section
13.1.6 of HTTP 1.1 including the cache-request-directives in Section 14.9.

7.2 Content Negotiation


A resource can have multiple representations. A representation is an encoding of the state of the resource, such
as text, JSON, or JPEG. The encoding of a representation is specified by a media type. The standard media types
can be found at http://www.iana.org/assignments/media-types/index.html.

[ed. Add media types for Ion and other Amazonisms]

Clients and services MUST identify the media type of the body of all HTTP requests and responses in the Content-
Type header.

Services MUST recognize the Accept, Accept-Charset, and Accept-Language HTTP request headers. The service
MUST respond with a representation that matches the requested encoding. If a service is unable to encode the
resource in a compatible representation, or accept a resource in the given representation, it MUST return an error.
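
Server-side handling of the Accept header might look like the sketch below. It is deliberately simplified: it honors q-values and wildcards but not the full precedence rules of HTTP, the function names are hypothetical, and the use of 406 as the error status is an assumption, since the text above only requires "an error".

```python
def parse_accept(header: str) -> list:
    """Return (media_range, q) pairs, highest q first."""
    ranges = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        if not fields[0]:
            continue
        quality = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        ranges.append((fields[0].lower(), quality))
    # sorted() is stable, so ranges with equal q keep the client's order.
    return sorted(ranges, key=lambda item: -item[1])


def matches(media_range: str, media_type: str) -> bool:
    if media_range == "*/*":
        return True
    range_type, _, range_sub = media_range.partition("/")
    type_, _, sub = media_type.partition("/")
    return range_type == type_ and range_sub in ("*", sub)


def negotiate(accept_header: str, available: list):
    """Pick a representation the client accepts, or signal 406."""
    for media_range, quality in parse_accept(accept_header or "*/*"):
        if quality <= 0:
            continue
        for media_type in available:
            if matches(media_range, media_type):
                return 200, media_type
    return 406, None
```

The chosen media type would then be reported back to the client in the Content-Type header of the response.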

7.3 Authorize Header

Generally, this: http://docs.amazonwebservices.com/AmazonS3/2006-03-01/index.html?RESTAuthentication.html,
is not a bad start. In version 2, we actually do it pretty well. There are a few key items: [ay43: Change the writing
style of this section? Make it into a normative one (using MUST, SHOULD, MAY) and avoid a conversational tone?]

1. Every request must be authenticated. You can have sessions, but they are very easy to compromise by a
man-in-the-middle (MITM) attack. [ay44: “session” is an overloaded term. To define “session” as a part of request
token that carries authentication semantics across multiple requests.]
2. If you use SSL, you can avoid authenticating each request, but the overhead is minimal to just
authenticate everything.
3. The request should be signed with a verifiable key, derived from the user’s authentication token (I am
explicitly avoiding the term password). Asymmetric signatures are preferred, but symmetric ones work if
need be. The problem with using symmetric crypto to sign the message is that it means the key must be
rotated more frequently.
4. The entire request must be signed. This includes all parts, including any delimiters used in the URI. The
simplest option is to sign everything except the signature.
5. Preferably, a session key should be negotiated and used for the signature.
6. The time stamp must be included in the signature to prevent replay attacks. A reasonable time frame
must be chosen within which a request is considered valid. Keep in mind, however, that perfect time
synchronization is nearly impossible. To avoid problems with replay requests, all requests should follow
ACID properties, and also be idempotent. Beyond those considerations consider requests that make
changes and see if they can be made more resilient. For instance, let us say we want to transfer $100
from someone’s bank account to mine. We could send a single request saying “transfer $100 from bob to
jesper.” Or, we could send a request for each person’s balance, and then one after that which says
“please set bob’s balance to $483.89 and jesper’s balance to $324.64.” The latter is impervious to replay
attacks, but requires more overhead in terms of both locks and requests. Which one to use depends on a
lot of factors, including whether the request can be protected from replay attacks some other way.
[ay45: I do not agree that this kind of idempotent nature of the latter example is actually more resistant
to replay attacks. Protection against replay attack would come from enforcing a version-# (from a version-# series)
or a transaction-# (e.g. GUID) embedded in a request.]

TODO: OAuth Text
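
Items 3, 4, and 6 of this section (a signed request, covering the whole request, with a time stamp inside the signature) can be sketched as below. This is an illustration under several assumptions: it uses a symmetric HMAC-SHA256 signature for brevity even though asymmetric signatures are preferred above, it signs only the method, URL, time stamp, and body (a real scheme would also cover headers), and the 300-second window and all function names are invented for the example.

```python
import hashlib
import hmac

MAX_SKEW_SECONDS = 300  # hypothetical validity window for a signed request


def canonical_request(method: str, url: str, timestamp: int, body: bytes) -> bytes:
    """Serialize the signed parts of the request in one fixed order."""
    body_hash = hashlib.sha256(body).hexdigest()
    return "\n".join([method.upper(), url, str(int(timestamp)), body_hash]).encode("utf-8")


def sign(key: bytes, method: str, url: str, timestamp: int, body: bytes) -> str:
    message = canonical_request(method, url, timestamp, body)
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, method: str, url: str, timestamp: int, body: bytes,
           signature: str, now: int) -> bool:
    # Reject requests outside the time window to limit replay.
    if abs(now - timestamp) > MAX_SKEW_SECONDS:
        return False
    expected = sign(key, method, url, timestamp, body)
    # Constant-time comparison avoids leaking how much of the signature matched.
    return hmac.compare_digest(expected, signature)
```

A time window alone still lets a captured request be replayed within the window, so a service that needs stronger protection would also track a per-request identifier such as a version or transaction number.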
[ay45, continued: Consider that one transfers some money from Bob’s account to Jesper’s account. Hence, this message
of setting Jesper’s balance to $324.64 happens. Then, some request transfers money from Jesper’s account to John’s
account, setting Jesper’s account down to $224.64. Now, I replay an earlier message. Jesper’s account is now
back to $324.64 (!!!).

Assuming both Bob’s and Jesper’s accounts are under the same resource manager, ACID nature can be easily achieved.
Protection against replay attack may be done by establishing a 2-phase business protocol: (1) the client asks the
service for a unique # (version # or transaction GUID # - the number series itself does not need to have
secure-random nature); (2) the client needs to include this unique ID in the actual transfer request payload. This
unique ID is a part of the data that is signed. Any replay of the same unique ID will be either simply ignored or
replied to with a failure message.]
[ay46: We are explicitly against resource locking in our overall REST design principle. We think about this kind of
issue further.]

7.4 HTTP Redirect
[ay47: Fill in the reasons behind why we are using HTTP Redirect in the REST context. Also, mention the
restrictions that a service should enforce to avoid any security implication of HTTP redirect]

7.5 SSL
[mwh48: I’m not sure how complete we’ll get here. Could mention oauth, message signing and encryption (as
opposed to SSL), and the IAA project.]

TODO: Clean up.

I can’t think of a single case where it is NOT OK to use SSL. Where it is OK to NOT use SSL is usually in requests that
contain no sensitive information, and that are not sensitive to replay or man-in-the-middle attacks. Generally, if
you need to use sessions at all, you are starting to get into a situation where SSL is important. If it is a valuable
transaction, or you are transmitting sensitive data, you really need to use SSL.

It is typically not acceptable to skip SSL just because you encrypt the payload or the headers. The reason is that SSL is session
based, and encrypts the entire session. It is much more difficult, therefore, to replay an SSL protected transaction
than one that is protected within the headers. The one that is protected within the headers requires the bad guy to
only capture a single packet to replay in the worst case scenario. That is not going to be possible with SSL.
Furthermore, since SSL uses session keys, once the session ends the server’s ability to parse a replayed request is
lost, reducing the need for strict time synchronization. Strictly speaking, you can provide the same level of security
with a custom protocol to sign and encrypt the headers, but by doing so you would end up re-inventing SSL, so you
may as well go with that in the first place.

7.6 Cookies

RESTful interactions are intended to be stateless. Cookies create shared state. Further, cookies are a
server-initiated shared state that is transparent to the application. [ay49: Personally I do not consider cookie exactly
as a state sharing mechanism. I consider that as a client side state storage mechanism. From service standpoint,
request from a cookie-capable client can be 100% stateless as well. My reservation about cookie is about: some HTTP
client is not cookie capable (intentionally). It complicates the REST interaction more (e.g. cookie-path,
expiration, and security related implication).]

Because of this, HTTP cookies, while convenient, break the semantics of RESTful interaction. If a service chooses
to create a mechanism for shared state between requests it can do so as a resource (e.g. a session resource).

8 Working with Structured Resources
[ay50: A new item?]

8.1 Data Formats



The first is raw undifferentiated user data such as PNG files, true HTML content, and the like. These should be
handled by following the well known HTTP standards.

Most services, especially those internal to our systems, are passing resources that the clients will need to
understand in detail. As such the resource needs to be encoded in an understandable way. This requires a
serialization format that is machine independent, reasonably flexible, reasonably capable, and reasonably efficient.
Example serialization formats include XML, JSON, Ion, Biblio Records, and Tibco datagrams (aka BSF
Datagram/Dictionary).

The choice of the serialization format we offer to our outside customers, that is, for services that are our
product, is dictated for the most part by customer requirements.

Typically this will be JSON for users who prefer REST interfaces and XML for those who prefer SOAP. A few years
ago other serializations would have been the "right" choice, perhaps DCOM, perhaps "Excel", perhaps CORBA or
ASN.1. A few years from now another format will be the "right" choice. In general we will need to support
multiple serializations for our customers. And the "right" choice will be driven first by customer demand, and
second by technical merits.

Between Amazon-owned services the issues are different in that the technical considerations are much more
important, in part because the sales aspect has a lower priority and issues like TCO (both hardware and wetware)
play a bigger role. Inside the firewall Ion is the preferred format. [mwh51: This is a key justification for all of
this, not just for Ion. Be wordy here.]

8.2 Why Ion/JSON (and not XML)?


 JSON as a simple, performant, and popular data serialization format.
 Ion for a semantically richer, better specified, and more compact internal representation. Ion is a strict
superset of JSON, so an Ion parser is also a JSON parser (per the JSON specification).
 Ion also has support for path expressions and schema validation.
 Not XML, due to complexity, bulk and operational cost. XML is a document format, not a data serialization
format, despite its use for data.

8.3 Type Representation and Negotiation


A representation of an entity might correspond to a specific version of a type definition. If a client or server has
this information available it SHOULD pass that information in the x-amzn-type-version HTTP header. [mwh52: Drop the
“-version” for simplicity ?] The value for
that header MUST be a fully-qualified SDL definition name (e.g. "[email protected]"). This header MUST
be included in the Vary header to properly support caching. [mwh53: I rejected using content-type for this.
Do we actually agree on that?] [ay54: Will that work in the case of XML? I know Ion is recommended by this doc for
internal communication. And, JSON is recommended for external communication. But, we may not be able to rule out XML
completely for external communication. I guess we should add after the MUST statement: “when the entity
payload is encoded in Ion format.”]

Advanced services might support viewing a resource through multiple versions of a type. Such services SHOULD
support type negotiation. When type negotiation is supported, the client MAY pass an x-amzn-accept-type HTTP
header with a list of the fully-versioned SDL types they support. The server MUST return data conforming to one
of those types, or return a 406 response code.

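Type negotiation as described above might be handled along these lines. The sketch makes several assumptions that the text does not settle: the x-amzn-accept-type value is taken to be a comma-separated list in client preference order, the type names are invented placeholders, the function name is hypothetical, and the Vary value lists the request header in addition to the type-version header required above.

```python
def negotiate_type(request_headers: dict, supported_types: list):
    """Return (status, response_headers) for a request with optional type negotiation."""
    raw = request_headers.get("x-amzn-accept-type", "")
    requested = [name.strip() for name in raw.split(",") if name.strip()]
    if not requested:
        # No preference stated: fall back to the service's default type.
        chosen = supported_types[0]
    else:
        # Honor the client's order and take the first type the service can produce.
        chosen = next((name for name in requested if name in supported_types), None)
    if chosen is None:
        return 406, {}
    return 200, {
        "x-amzn-type-version": chosen,
        # Caches must key on the negotiated type, so name the headers in Vary.
        "Vary": "x-amzn-type-version, x-amzn-accept-type",
    }


status, headers = negotiate_type(
    {"x-amzn-accept-type": "example.Contact@2.0, example.Contact@1.0"},
    ["example.Contact@1.0"],
)
```

In this call the service cannot produce version 2.0, so it falls back to the client's second choice and reports it in x-amzn-type-version.
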
HTTP Header Use:

content-type and accepts


this should be of the form:

application/<schema>-<format>



[mwh: I talk about schema negotiation below, and don’t do it this way.]

where

<schema> is the versioned Ion schema that defines the content body.

<format> is the serialization format being used - ion, ion-binary, ion-text, or JSON (or xml).

[cas: these constants need to be examined carefully, but the set of choices should be reasonably stable.]

by versioning the schema here we can often (but not always) avoid versioning the API itself.

NOTE: this interaction should be tested - how do intermediate caching tools handle the content-type and accepts
headers?

security related
cache control related [ay55: TODO: fill in more?]

8.4 Interface Specification


This proposal emphasizes JSON and Ion as the preferred data formats for entity representations in REST
interactions. The new Service Definition Language (SDL) (see https://dse.amazon.com/?SDL) is the preferred data
schema and service interface description language for REST services. The schema definition portion of the SDL
supports multiple data formats, but is designed to fully and directly support JSON and Ion, as well as open content
and extensibility. The interface definition portion of the SDL is used to represent the operations that are supported
by a service. The SDL also implements a carefully designed hierarchical naming and versioning mechanism, which
in turn facilitates schema and interface evolution. All Amazon.com REST services SHOULD [ed. MUST?] describe
their service in SDL. [ay56: SHOULD is fine, as we are saying SDL is the preferred schema format, not the only
schema format in this doc.]

Example: Commented [mwh57]: Judgment call on whether or not to


move this to an appendix…
package:: {
  name: "com.amazon.EntityService",
  major_version: 1,
  minor_version: 0,
  rest_http_uri: "EntityService",
  entries: [
    package:: {
      name: businessObject,
      major_version: 1,
      minor_version: 0,
      rest_http_uri: "business-object",
      entries: [
        type:: {
          name: BusinessObject,
          major_version: 1,
          minor_version: 0,
          base: struct,
          fields: [
            { name: id, type: string },
            { name: data, type: string }
          ],
        },
        type:: {
          name: NotFoundException,
          major_version: 1,
          minor_version: 0,
          base: struct,
          fields: [
            { name: reason, type: string }
          ],
          rest_http_error_code: 404
        },
        type:: {
          name: UpdateFailedException,
          major_version: 1,
          minor_version: 0,
          base: struct,
          fields: [
            { name: reason, type: string }
          ],
          rest_http_error_code: 409
        },
        operation:: {
          name: get,
          major_version: 1,
          minor_version: 0,
          in: string,
          out: BusinessObject,
          exceptions: [ NotFoundException ],
          rest_http_method: get,
          rest_http_uri_data: "{in}"
        },
        operation:: {
          name: put,
          major_version: 1,
          minor_version: 0,
          in: BusinessObject,
          out: void,
          exceptions: [ UpdateFailedException ],
          rest_http_method: put,
          rest_http_uri_data: "{in.id}",
          rest_http_body_data: "{in}"
        }
      ]
    }
  ]
}

[ay58: What if people using an operation name beyond HTTP verbs “get”, “put”, “delete” and “post”? How do we
describe RPC call in SDL in the context of REST?]

[mwh59: I am rethinking this stuff (syntax-wise). Might want to add a note that it’s for demo purposes only, and
subject to change.]

[ay60: Will “{in}” overwrite “{in.id}”?]

This SDL definition represents the EntityService's interface as the top-level package definition. The EntityService
manages one kind of entity, which is represented by the nested package. The nested package defines one data
type, two exceptions, and two operations supported by the service for this kind of entity.

The "rest_http_uri" fields are concatenated to produce the base URI path for an operation, in this case
"EntityService/business-object".

The SDL is not REST-specific. It supports a general model of operations that take one input value and produce one
output value. With REST, however, input data can come from several sources: trailing components of the URI,
query parameters, and the HTTP message body. The "rest_http_xxx_data" fields map these various inputs onto the
single operation "in" data value. The "{in.a.b.c}" syntax is a minimal path language specifying a location in the input
data value. This syntax also permits multiple path components to individually contribute input data (e.g.
"{in.merchantId}/{in.merchantSKU}"). Note: this example does not (yet) represent our suggested best practice for
specifying service interfaces in SDL.

[ay61: I am for this feature in general. But, be careful of the slippery slope. And, we want to make sure this
minimal path language would be a clean subset of the general query language that we would define in future.]
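
To make the path language concrete, here is a minimal sketch of how a client might expand a "rest_http_uri_data" template and join it to the concatenated "rest_http_uri" fields. The function names are hypothetical, the SDL's REST semantics are not yet fully specified, and URL-escaping of the substituted values is deliberately left out.

```python
import re
from typing import Any, Iterable

# Matches "{in}" or "{in.a.b.c}" and captures the dotted tail (possibly empty).
_PLACEHOLDER = re.compile(r"\{in((?:\.[A-Za-z_][A-Za-z0-9_]*)*)\}")

def resolve_path(value: Any, dotted: str) -> Any:
    """Follow a dotted path such as '.a.b.c' into nested dicts ('' means the whole value)."""
    for key in filter(None, dotted.split(".")):
        value = value[key]
    return value

def expand(template: str, in_value: Any) -> str:
    """Replace every '{in...}' placeholder with the addressed part of the input value."""
    return _PLACEHOLDER.sub(lambda m: str(resolve_path(in_value, m.group(1))), template)

def operation_uri(package_uris: Iterable[str], uri_data_template: str, in_value: Any) -> str:
    """Concatenate the nested rest_http_uri fields, then append the expanded URI data."""
    return "/".join(list(package_uris) + [expand(uri_data_template, in_value)])
```

For the example service, `operation_uri(["EntityService", "business-object"], "{in.id}", {"id": "abc", "data": "x"})` yields "EntityService/business-object/abc".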

Note: the SDL's REST semantics are not currently fully specified. This will be rectified in the near future.

[ay62: I think we need a formal SDL document that describe what “rest_http_*_data” attributes means in SDL.
The description here serves as a primer or an example. That is fine. But, we need more formal description, I think.]

Note: currently the SDL is not integrated directly into our service frameworks. As such, use of SDL does not
currently imply any particular run-time support.

[ay63: I think we can leave this negative comments out for now.]

All REST services at Amazon SHOULD export an SDL schema document at
"http://host:port/ServiceRootPath/interface".

[ay64: Using “*Path” term would make it more familiar to Java Servlet API users.]

This SDL document should contain definitions for all types and operations understood by the current version of the
service, including all previous versions of those entities.

[mwh: transitive closure? or just those for which it is the authority?]

[mwh: if we had a central repository, we could return -only- the names of the top-level definitions. do we want to
prepare for such an eventuality?]

todo: returning type/ver for data (to look exactly like entity name versioning)

8.5 Binary Data

Single instances of a binary data object, such as a JPEG file or a block of raw data for encryption, SHOULD be
represented directly in the body of an HTTP message, using the appropriate “Content-Type” header.

Complex data structures containing binary data, for example a struct containing a few strings and one large blob,
can be problematic. Use of binary Ion is the preferred mechanism for accommodating this situation, since it
represents mixtures of various kinds of data, including blobs, with minimal space overhead. Text formats, such as
text Ion and JSON, would require encoding the binary data in a representation such as base64, which has high
space and processing overhead for large data objects. For simple, flat data structures, the “multipart/mixed”
media type MAY be used as an alternative to enable binary data to be efficiently encoded.
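
The space overhead of a text encoding can be checked directly. The snippet below is a small illustration using Python's standard library, with an arbitrary 3072-byte blob and made-up field names. It does not measure binary Ion, only the base64 expansion that a JSON or text Ion document would incur.

```python
import base64
import json

blob = bytes(range(256)) * 12   # 3072 bytes of arbitrary binary data
encoded = base64.b64encode(blob).decode("ascii")

# base64 emits 4 output characters for every 3 input bytes.
print(len(blob), len(encoded))   # 3072 4096

# Embedding the blob in a JSON text document forces the base64 form.
doc = json.dumps({"name": "thumbnail", "blob": encoded})
```

The encoded form is one third larger than the raw data, before counting the cost of encoding and decoding it on each side.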

9 Queries and Views

[ay65: Re-numbering the sub-sections here to group all Queries and Views related text under one big section. I am
still re-organizing content under this big section.]

9.1 General Query semantics

[mwh66: Two sections here on Query, neither of which are Alex’s latest.]

[JH67: There should be a section on general query syntax, name=value pairs query syntax, and Alex’s proposed
rich query syntax.]

[JH68: Text should undergo another revision to address (1) generic query discussion, (2) name=value pair query
discussion, and (3) rich query discussion.]

Per RFC 3986, a query forms part of a URI as follows:

<scheme>://<authority><path>?<query>

The query part of the URI is intentionally a non-hierarchical component of the URI (RFC 3986, section 3.4) that “serves to
identify a resource within the scope of the URI’s scheme and naming authority”. The format of the query
parameter is unspecified in RFC 3986, but traditionally is a collection of “name=value” pairs, separated by “&”
and/or “;”. This document proposes both a traditional “name=value” syntax and a rich syntax. If both are
supported, the list of “name=value” pairs should be transformable into the richer syntax and exist as a shorthand
for it.

In general, the query part of a URI exists to provide additional parameters for a resource request, in particular in
the context of CGI requests, where the path component identifies a CGI script and the query parameters provide
input to that script.

Whilst the query parameters provide great flexibility, the implicit non-hierarchical nature of the query is a
substantial drawback in that it prohibits trivial remapping of resources. In particular, relative reference resolution
in RFC 3986 operates on the hierarchical path segments; the query takes no part in that hierarchy.

A RESTful URI, on the other hand, is by intention hierarchical in nature. For example,
“/weblogs/myweblog/entries/100” considers “100” to be most specific, and “weblogs” to be least specific.
Similarly, “/Universe/Earth/37.0,-95.2” considers the coordinates “37.0,-95.2” to be a more specific location than
Earth (but is specific to Earth).

A benefit of hierarchical URIs for web services is that they allow trivial redirection for specialization. For example, in the
context of a service such as Sable, valid RESTful URIs include “item/v1/4/3551551677” to identify item scoped
data (“item”) in region “4”, with ASIN “3551551677”. The same entire fleet could choose to process data in region
“1”, or may choose to specialize at the region level of the path. Doing this specialization on a non-hierarchical
query such as “sable?scope=item;region=4;asin=3551551677” would be cumbersome and prone to errors.

[ay69: Where does this “1” come from? “v1”?]

[ay70: Both choices are “region” levels?]

There are, however, scenarios where a query does make sense. For example, the Product Aggregator Service applies
filters on its output data using a concept called “facets” to reduce what data is retrieved by a client. Likewise, there
are a number of applications where it is desirable to paginate output, such as “?start=1;count=20”. Both of these
fall into the same scope as database queries (i.e. they can be mapped to a SELECT statement and/or WHERE
clause) and do not in themselves define a resource.

[ay71: Chris Suver prefers something more similar to “?page=3”. Allow the server side to control the size of the
page.]

[ay72: It does not define a new entity resource. But, it does define a new query resource.]

Correct use of Query Parameters
1. An individually identifiable resource MUST NOT be identified by a query parameter.
   • As a resource MUST be hierarchical in nature, it is not appropriate to fit it into a non-hierarchical
     component of a URI.
   • Standardizing on the representation of resources in a uniform manner also shows consistency
     across different services.
   • Moving resource identification out of query parameters frees up the use of query parameters for
     their originally intended purpose.
   [ay73: This “MUST NOT” restriction is not so practical. This kind of restriction does not exists in other parts of
   query world. (e.g. “select * from employee where empNo=123) At most, it is just a “SHOULD NOT”.
   Alternatively, we could say: the resource, where the query targets at, SHOULD not contain a data set which is
   too large. (i.e. a smaller data set automatically c]
2. A version of a resource MUST NOT be identified by a query parameter.
   • A version of a resource is itself a specialization of that resource.
3. Pagination SHOULD be identified by query parameters.
   • This represents a special case of a filtering WHERE clause in a SQL query.
   • Pagination is a filtering of a view of a resource and does not represent a different resource.
   • The data returned as a result of pagination SHOULD provide authoritative URIs to retrieve the
     data that is paginated. For example, if a list of ASINs is paginated, a URI should be provided
     along with each ASIN to retrieve the same or further data for that ASIN.
4. Inclusion/Exclusion Filtering SHOULD be identified by query parameters.
   • This represents the general case of a filtering WHERE clause in a SQL query, and is a more generic
     form of the Pagination case.
5. Qualifying information MAY be specified by query parameters.
   • For example, in Product Aggregator Service, a query on an ASIN (the resource) will always return
     information about that ASIN if the ASIN exists. However, the type of data returned may be
     further qualified by specifying other context information such as an exclusive merchant ID, or a
     customer ID.
   [ay74: “Qualifying information” – this term is very vague. Do we mean “unique key information” (full key or
   partial key)?]
6. Quantification of returned data MAY be specified by query parameters.
   • This is analogous to specifying columns in a SQL “SELECT” clause, and controls how much
     information is returned for a given resource. Again, using the Product Aggregator Service, a
     subset/superset of the information that would normally be returned for a resource can be
     controlled by specifying a list of “facets” (types of information) and “detail level” that cannot in
     themselves be described as resources.
   [ay75: Would it better to say “Projection”? “Quantification” is again very generic and vague.]
   [ay76: For “subset” feature, I am not surprised and I am for it. However, for “superset” feature, we may need more
   use case elaboration. Because, it could means a lot of features can come in through “superset” gate.]
7. Authentication information MUST NOT be specified by query parameters.
   • These must be specified using the correct HTTP headers.
8. Session information MAY be specified by query parameters.
   • Although a RESTful interface SHOULD NOT be session based, there are business cases where data
     must be qualified by some session information. It is assumed that some additional service (even
     if theoretical) provides the means of validating that session information and potentially mapping
     that data into more detailed session information.
   [ay77: Use case justification?]
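
Rule 3 can be illustrated with a small sketch. The function name, the 1-based reading of “start”, and the URI layout are assumptions made for the example; the point is only that each paginated entry carries the authoritative URI of the resource it summarizes.

```python
def paginate(asins, start, count, base_uri):
    """Return one page of a listing; every entry carries the URI of the full resource.

    `start` is treated as 1-based here, an assumption based on the
    "?start=1;count=20" example.
    """
    page = asins[start - 1 : start - 1 + count]
    return [{"asin": a, "uri": f"{base_uri}/{a}"} for a in page]
```

A client that receives the page can follow each "uri" value with a plain GET, without needing to know how the listing was filtered.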

Name=Value List Query Parameter Format

The following conventions SHOULD be used when specifying a query parameter:

1. A query parameter consists of “name=value” pairs, where both the name and value are individually
   encoded using only the RFC 3986 “unreserved” characters. To avoid ambiguity, the “name” portion must
   start with an alphabetical character and consist of only the set of characters “a-z”, “A-Z”, “0-9”, “_”, “.”
   and “-”.
   o Characters outside of the “unreserved” range in the query-string MUST cause the request to fail.
   [ay78: Why related to ambiguity? I guess it’s more related syntax simplicity? Why allow white spaces? Why
   allowing “-“? The “-“character usually gives some syntax in other programming languages later. And, why
   not “_”?]
2. The “name=value” pairs are separated by “;”.
   • “;” SHOULD be accepted and is desired per W3C recommendation.
   • “&” MAY be accepted to support traditional use.
   [ay79: If we may accept “&”, we can only say “SHOULD” for “;”.]
3. The order that names are listed SHOULD be consistent when auto-generated, but MUST be allowed in any
   order.
   [ay80: Reason is for requesting signing?]
4. Multiple instances of a name (e.g. “include=5;include=6;include=7”) MAY be allowed, but a request
   specifying multiple instances when it is explicitly not allowed MUST cause the request to fail.
   • In particular, it is important for security reasons that a service that can only handle one instance
     of a name (e.g. “pageFrom”) must not attempt to arbitrarily pick one from a list if a list of
     matching names is given.
5. Additional query parameters that currently have no assigned meaning SHOULD be allowed.
   • In keeping with the philosophy of REST, a future definition of a query parameter should
     degrade gracefully when presented to a pre-existing older service. Likewise when a parameter is
     deprecated. There are however arguably some exceptions to this rule.
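
The conventions above can be sketched as a small parser. This is a hypothetical helper, not part of any Amazon framework. It reads “encoded using only unreserved characters” literally, so it rejects anything else, including percent-escapes, which a real service may well want to accept.

```python
import re

NAME = re.compile(r"[A-Za-z][A-Za-z0-9_.\-]*\Z")
VALUE = re.compile(r"[A-Za-z0-9\-._~]*\Z")   # RFC 3986 "unreserved" characters

class BadQuery(ValueError):
    """The request must fail (for example with a 400 response)."""

def parse_query(query, multi_valued=frozenset(), accept_ampersand=True):
    """Parse 'name=value' pairs into a dict mapping each name to its list of values."""
    params = {}
    if not query:
        return params
    separators = "[;&]" if accept_ampersand else ";"
    for pair in re.split(separators, query):
        name, eq, value = pair.partition("=")
        if not eq or not NAME.match(name) or not VALUE.match(value):
            raise BadQuery(f"malformed pair: {pair!r}")
        if name in params and name not in multi_valued:
            # Never pick one instance arbitrarily; reject the request instead.
            raise BadQuery(f"repeated parameter: {name!r}")
        params.setdefault(name, []).append(value)
    return params
```

Names the service does not recognise are returned like any other, so the caller can ignore them, as rule 5 asks. Names that may repeat, such as "include", have to be listed in `multi_valued`.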

9.2 Rich Query Syntax

[JH81: This should be filled in with Alex’s latest document]

Most services have some form of query. At a minimum, GET is a very specialized form.

Lists should be defined by the SDL: if your result may be a list it should be defined as such, and if it is always going
to be a singleton then it should not be a list.

Just as "list of" should be defined in the service's SDL, so should a URL as a value (as distinct from primary keys), so
we need an SDL type for URLs.

There are three forms of query:

Straight GET - where the primary key (PK) information is included in the URL and it retrieves a specific resource.
While this is often not thought of as a query in the REST community, it certainly is when viewed from a database
'theory' point of view, and shouldn't be overlooked as such.

GET with query parameters - here the base URL does not specify a singleton resource but a "root" or view (i.e.
table) of some kind. The query parameters define a filter (details should align with current use) that restricts the
return to a subset of the "multi-entity" resource.

"Real query" - this is the most general case, and should be embedded in a POST as a specific method, say
"QUERY".

[ note: One question to address is the query language. There is a long history of individual services offering
individual query languages. General purpose query languages, such as SQL or XQuery, are a bit hairy as they offer
the ability to write arbitrarily expensive operations. We should have some suggestions here and some alternate
methods for alternate languages. At a minimum we need an idiom for including the query language in the URL, the
header, or the SDL entry point definition, including specifying "subsets" of various standard languages. ]

A variant on the full "query" is filter. This is a step above the query parameters, which really only offer "and" as a
conjunction. It would allow a more complete boolean predicate but not offer projection (changing the shape of
the returned contents) nor joins (self or otherwise, but where multiple tables or views are used to specify the
results). Joins in particular tend to generate expensive query plans and are more complicated to implement.

[ note: we could easily offer, à la IonPath expressions, a suitable filter/predicate grammar and implementation.
That is, for any return value that returns a list of something the app developer could mark it as "support filter" and
our general server software could support a simple form of filter. More advanced developers could intercept this
and push the predicate back, clear to the db if that was appropriate. In either case it would generate more flexible
entry points and considerable consistency. The JIT'd path expressions (in proof of concept only at this point)
would make even the trivial version reasonable.]

9.3 Multiple Views of Same Resource

As well as a natural view of a resource, there can be other views of a resource. Characteristics of such views can
include:

• Filtered View – only a subset of a view of a single resource is returned. This view may be filtered for, e.g.
  performance reasons (reducing bandwidth) or security reasons (reducing visibility to sensitive fields).

• Projection View – a single resource is projected onto a different shape. A natural projection of a view is to
  handle different versions of a schema. It is possible that additional resources are joined in this projected
  view (e.g. as done by Product Aggregator Service).

Such views may be pre-configured, or potentially may be driven by a customizable view mechanism.

[mwh82: Some language about whether same or different resource. That views have (or don’t have) schema. But
then what about dynamic views?]

9.4 Definitions
This section serves to create working definitions for the discussion that follows.

Natural View
This is the most natural view of a resource, and would be equivalent to the SQL expression “SELECT * FROM
<table> WHERE <primary-key> = <resourceId>”.

The schema from this view will be considered the primary schema.

Alternate Version View
A natural view may have previous versions. A previous version has a different schema, but there is a
bi-directional functional mapping between the two views.

Filtered View
This is a reduction of the natural view, and would be equivalent to the SQL expression “SELECT field1, field2, …
FROM <table> WHERE <primary-key> = <resourceId>”.

This view uses the same primary schema as the natural view, and therefore imposes the constraint that
only optional fields (per schema) may be removed by filtering.

[mwh83: This seems excessively constraining. Do we really want this?]

Projected View
This is a view constructed from one or more natural views, and projected onto a different view. This view would
typically be described by a different schema. Joined views are included in this definition.

Pre-configured View
This is a filtered or projected view that is defined by the service owner (either by code or by configuration).

Custom View
This is a filtered or projected view whose definition is itself considered a modifiable resource.

[mwh84: This bullet list seems to cover 2 or 3 (somewhat) orthogonal design dimensions. Kind of view != how
that view is defined != how that view is named.]

[mwh85: Always ?]

9.5 Observations
Mutability
Although it may be permitted (but not required) to allow a “PUT” statement to the primary resource, performing a
PUT operation on a projected resource will in most cases be invalid, particularly as properties in the projection
may be computed, and/or properties may be pulled from multiple resources that cannot be
updated together atomically.

Projections


Given that a projection may be a cross-join of multiple resources, its resource identifier and resource namespace
may be considered orthogonal to the resources for the primary view. The keys for a resource may coincidentally or
conveniently have overlap.

Versioning


To some extent, support for older schemas could be implemented as a projection. However, there is a practical
requirement to be able to PUT using a previous version schema. At the same time, given that versions of a view
have a functional relationship, then a PUT using a previous version schema can be translated into an equivalent
PUT for the current version schema.
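
The translation of an old-version PUT can be sketched as below. The two schema versions and the renamed field ("zip" in version 1, "postal_code" in version 2) are invented for illustration; the sketch only shows how a functional mapping between versions lets the service store a single current form.

```python
# Hypothetical schemas: version 1 uses "zip", version 2 renames it to "postal_code".
def v1_to_v2(doc):
    return {"id": doc["id"], "postal_code": doc["zip"]}

def v2_to_v1(doc):
    return {"id": doc["id"], "zip": doc["postal_code"]}

UPGRADES = {1: v1_to_v2}    # version n -> function producing version n + 1
CURRENT_VERSION = 2

def put(store, doc, schema_version):
    """Translate a PUT made with an older schema into a PUT of the current schema."""
    if not 1 <= schema_version <= CURRENT_VERSION:
        raise ValueError("unsupported schema version")   # would map to an HTTP error
    while schema_version < CURRENT_VERSION:
        doc = UPGRADES[schema_version](doc)
        schema_version += 1
    store[doc["id"]] = doc
```

A GET that negotiates version 1 would apply `v2_to_v1` to the stored value, which is the projection-like half of the same mapping.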

Filter

Filtering only has meaning during a GET operation; the filtered result is clearly still the same resource. This filtering may be
explicitly applied (for example, “we only desire the title property of a catalog item”), or implicitly applied (for
example, “do not return Target specific attributes for an item contribution if the request did not originate from
Target”).

Custom Views


Custom views are themselves resources that are subject to CRUD operations.

9.6 Amazon REST Approach

[ay86: The title of this section is a bit too generic.]

Schema Versioning

[mwh87: I cover this in section 8.3 with a different approach. Agree on my approach? Delete this section?
Talk more?]

When a resource is retrieved (GET), its resource schema should be negotiated. The established mechanism for
doing this on HTTP is using the “Accept” header.[1]

When a resource is updated (PUT), the version of the schema that the PUT is based upon must be specified. If a
PUT is not allowed using the given schema version, it must result in an HTTP error.

[mwh88: Alternatively, SDL has (or will have) some conventions for attaching a type name to a data value.]

(Note: Is there any pre-existing means/header to specify the schema for PUT? Maybe mime type?)

Filter

Where a resource can be retrieved (GET), with a given schema (per Schema versioning), it potentially can be
filtered implicitly or explicitly. The resulting output MUST conform to the resource’s schema.

[ay89: I think we may want to flip how we say about this requirement. When a query is performed on a resource, a
resource schema SHOULD be generated by the service, based on the underlying resource schemas and query
definition.]

[1] We need to establish the convention for this

For example, if the output is in Ion format, and the schema for the response declares properties as required, those
properties must not be filtered out thereby breaking the schema contract.

The filter MAY be specified by application-defined[2] query parameters. For example, PAS defines a query parameter
“Facets” to allow a list of filters to be passed.
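
The constraint that filtering must not break the schema contract can be sketched in a few lines. The function and the field names are hypothetical; the sketch assumes the service knows which fields its schema declares as required.

```python
def apply_filter(resource, requested, required):
    """Keep the requested fields, plus every field the schema declares as required."""
    keep = set(requested) | set(required)
    return {name: value for name, value in resource.items() if name in keep}
```

Asking only for "title" on a resource whose schema requires "id" still returns both fields, so the filtered output validates against the same schema as the natural view.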

The filter MUST NOT be specified during PUT operations[3].

[ay90: Reconcile with “Mutability” paragraph above.]

The filter MAY have the schema version specified. That is, the filter is applied to a specific version of a schema.

[ay91: These two statement is a bit under specified. An example would be good.]

Projection
Projection is defined as any/all of the following cases:

• The view is an alternative composition of the natural view that cannot be described as a different (older,
  newer) version of the natural view schema and/or does not have a simple functional relationship.

  o This view may use the same primary key as the natural view.

  o This view may contain multiple repeated instances of a single property from the natural view.

  o This view may omit properties from the natural view.

  o This view may contain computation results based on the natural view.

• The view is a composition of multiple natural views in a join/union relationship.

o This view may (but need not) use a different primary key.

A projection MUST use a different namespace to the original resource. For example, if the original resource is
/a/b/c/<primary-key> then the projected view MUST NOT use the resource name /a/b/c/<primary-key> for its
projection or use this as the prefix of its projection.

The projection MAY be a different service whose purpose is to provide projections to the underlying resource.[4]

Each different projection of a resource MUST use a different resource name to any other projection of a resource.[5]

A projection MUST follow the naming conventions to define the projection’s schema.

An approach to this could be:

/projection/<projection-name>/a/b/c/<projection-primary-key>

[2] Should this be application defined? Can we be more specific? PAS uses the query parameter “Facets” to list a
series of top level fields/detail levels that results in fields being excluded. PAS itself returns a projection as part of
its service definition. This filter does not modify the projection, but rather filters it.

[3] Should any query parameters ever be specified?

[4] E.g. PAS

[5] That is, should you identify projection A of a resource, then projection B can be considered a projection of A as
well as a projection of the original resource, and follow the same rules.

Custom Filters/Projections


Filters and Projections are themselves resources. Filters and Projections SHOULD use different resource
namespaces to each other. Projections MAY provide the base name of a projected resource.

E.g. Given the resource:

/projection/<projection-name>/a/b/c/<projection-primary-key>

Then it may be desirable to access the underlying projection as the following resource:

/projection/<projection-name>

Note: should this be a suggestion, or should this be a requirement?

Note: should we leave defining the format of the projection outside of scope of this working group?

10 Extended Use-Cases

10.1 Batching
Batching is a common technique for improving implementation efficiency and reducing latency for a series of
invocations in request/reply interactions. Batching involves sending multiple invocation requests in a single HTTP
request, and receiving the results of those invocations in the single corresponding HTTP response.

A client sends a batch request to a URL that supports batching. The URL will typically be the top-level name of the
target service. The HTTP method MUST be BATCH. If it is inconvenient or impossible to use “BATCH” for the
HTTP method, the client can use POST with the “x-http-method-override: BATCH” header.

[mwh92: Did I say that? Do we agree on this?]

The body of the HTTP request MUST be an array of structures that represent the individual requests. The body of
the HTTP response MUST be an array of structures that represent the individual responses, in the same order as
their corresponding requests. The HTTP response code of the enclosing response message should indicate success if the
batch is processed at all by the service. The status of each individual request is returned in the corresponding
individual reply. The URLs of the enclosed requests SHOULD refer to the same service that the batch is directed to.

[mwh93: MUST ?]

The content type of a batch HTTP request MUST be one of “text/json”, “text/x-amzn-ion”, or
“application/x-amzn-ion”. The following combinations of batch request content type and embedded request content
type are permitted:


Batch Request Content Type   Embedded Request Content Type   Embedded Request Body Format
JSON                         JSON                            JSON value
JSON                         Other Text                      JSON string
JSON                         Other Binary                    JSON string, base64 encoding of binary
Ion                          JSON / Ion                      JSON / Ion value
Ion                          Other Text                      Ion String
Ion                          Other Binary                    Ion BLOB

The individual requests MUST contain "url", "method", and "headers" fields and MAY contain a "body" field. The
individual responses MUST contain "status", "reason", and "headers" fields and MAY contain a "body" field. The
URL of every individual request MUST be a relative URI. That URI is appended to the URI of the HTTP request.
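
As an illustration of the JSON rows of the table and of the required fields, the sketch below builds a JSON batch body on the client side. The helper names, the "/order/5678/label" resource and the "image/jpeg" content type are made up for the example, and the headers are written as a list of strings as in the example that follows.

```python
import base64
import json

def embed_body(content_type, body):
    """Encode one embedded request body for a JSON batch, following the table above."""
    if content_type == "text/json":
        return body                                     # JSON value, embedded as-is
    if isinstance(body, bytes):
        return base64.b64encode(body).decode("ascii")   # other binary -> base64 string
    return str(body)                                    # other text -> JSON string

def make_request(url, method, content_type=None, body=None):
    """Build one entry of the batch array: url, method, headers and optional body."""
    entry = {"url": url, "method": method, "headers": []}
    if body is not None:
        entry["headers"].append(f"content-type: {content_type}")
        entry["body"] = embed_body(content_type, body)
    return entry

batch = [
    make_request("/order/5678", "PUT", "text/json", {"field1": "data1"}),
    make_request("/order/5678/label", "PUT", "image/jpeg", b"\xff\xd8\xff"),
    make_request("/order/5678", "GET"),
]
payload = json.dumps(batch)   # body of the enclosing BATCH request
```

Every "url" is relative, so the service resolves it against the URL the batch itself was sent to.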

The individual requests do not need to share any characteristics; they are completely independent. No individual
request headers are inherited from the main HTTP request.

The individual requests in a batch can be executed by the service in any order, or in parallel. The batch mechanism
does not imply any kind of transactional or all-or-nothing semantics.

An example batch request:

BATCH http://someHost:1234/MyService
Content-Type: text/json
Content-Length: ...

[
  {
    "url": "/order/5678",
    "method": "PUT",
    "headers": [ "content-type: text/json", ... ],
    "body": { "field1": "data1", ... }
  },
  {
    "url": "/order/5678",
    "method": "GET",
    "headers": [ "accept: text/json", ... ]
  },
  ...
]

The corresponding reply would be:

HTTP/1.0 200 OK
Date: ...
Content-Type: text/json
Content-Length: ...

[
  {
    "status": 200,
    "reason": "OK",
    "headers": [ ... ]
  },
  {
    "status": 500,
    "reason": "Internal Server Error",
    "headers": [ ... ]
  },
  ...
]

This form of batching is entirely at the protocol level. It is not expressed as part of a service’s interface.
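
On the service side, the semantics described above (independent requests, one reply per request in request order, no all-or-nothing behavior) might look like the following sketch. The `handle` callback is a hypothetical dispatcher into the service, and sequential execution is just one permitted choice, since the text allows any order or parallel execution.

```python
def process_batch(batch_url, requests, handle):
    """Run each embedded request independently and collect one response per request.

    `handle(method, url, headers, body)` is a hypothetical dispatcher into the
    service; it returns a (status, reason, headers, body) tuple.
    """
    responses = []
    for req in requests:
        # The embedded URL is relative: append it to the URL of the batch request.
        url = batch_url.rstrip("/") + "/" + req["url"].lstrip("/")
        try:
            status, reason, headers, body = handle(
                req["method"], url, req["headers"], req.get("body"))
        except Exception:
            # A failing request does not abort the batch.
            status, reason, headers, body = 500, "Internal Server Error", [], None
        response = {"status": status, "reason": reason, "headers": headers}
        if body is not None:
            response["body"] = body
        responses.append(response)
    # The enclosing response reports success because the batch itself was processed.
    return 200, responses
```

Because the replies come back in request order, a client can pair them with its requests by position alone.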

There are two other techniques that can also be used to address the same issues.

The client can also open several connections to the service and execute calls in parallel. One disadvantage of this
is increased resource utilization on both the client and server (i.e. sockets). Another disadvantage is the increased
complexity of managing multiple connections, although this might be reduced by library code. In fact, with a
sufficiently sophisticated client, the batching protocol above could be automatically transformed into parallel calls
using multiple connections.

Streaming of HTTP requests would have been the preferred alternative over both batching and multiple
connections, but it is not well supported by existing tools and libraries. Supporting streaming becomes especially
problematic with intermediate caches and proxies, basically any process that could divide the stream of requests
to multiple destinations. Most such intermediate components simply do not support HTTP streaming.

10.2 Partial Resource Updates


“Partial update” here refers to one of two forms.

The first is a PUT (or POST) to the resource URL where the body contains only a portion of the actual resource, for
example POSTing just the zip code. The idea is that only the data in the body would be applied to the
resource identified by the URL. In reality this is the sort of update most merchants apply to item
data, but the pipeline has a fair amount of code, policy, and metadata to drive the reconciliation of the partial
update and the full existing item.

The second form allows PUTting (typically) to a URL that "drills into" the resource. For example PUT to
.../customer/123/address/zip_code, and again where the body contains only the zip code value. This is very
appealing as it treats the hierarchy of the resource as if this were the hierarchy of a file system.

A big problem with both of these is that they raise significant coordination issues in a real environment. How
should concurrent updates be handled? What happens when updates conflict, say two requests arrive to update the zip
code? What happens when two updates don't (appear to) conflict, say one changes the last name and a
different request changes the first name?

We allow POST and we can offer guidance on how to handle smaller, focused updates. But offering them should
require more work. In particular, we should force the service owner who wishes to offer partial update to fully define the
behavior in the face of concurrent updates.
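
One way a service owner might pin down that behavior is a version check on every partial update. The sketch below is only an illustration of such a policy, not something this document prescribes: the in-memory `ResourceStore`, the `ConflictError` exception and the integer version counter are all hypothetical.

```python
class ConflictError(Exception):
    """Raised when a partial update is based on a stale version."""


class ResourceStore:
    """Hypothetical in-memory store that versions each resource."""

    def __init__(self):
        self._data = {}  # key -> (version, fields)

    def put(self, key, fields):
        """Replace the whole resource and bump its version."""
        version = self._data.get(key, (0, {}))[0] + 1
        self._data[key] = (version, dict(fields))
        return version

    def get(self, key):
        version, fields = self._data[key]
        return version, dict(fields)

    def partial_update(self, key, expected_version, changes):
        """Apply only the supplied fields, if the caller saw the latest version."""
        version, fields = self._data[key]
        if version != expected_version:
            raise ConflictError(
                f"expected version {expected_version}, found {version}")
        self._data[key] = (version + 1, {**fields, **changes})
        return version + 1


store = ResourceStore()
v1 = store.put("customer/123", {"first": "Ann", "last": "Lee", "zip": "98101"})
v2 = store.partial_update("customer/123", v1, {"zip": "98109"})

# A second writer still holding version v1 is rejected rather than merged.
try:
    store.partial_update("customer/123", v1, {"last": "Ray"})
    stale_rejected = False
except ConflictError:
    stale_rejected = True
```

Note that this particular policy also rejects updates that do not (appear to) conflict, such as one request changing the last name after another changed the zip code. A service could instead choose field-level merging, but it would then have to define that reconciliation explicitly.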

Service owners are certainly welcome to expose resources which are partial views of some data, such as just the
contact information about a customer. These could quite reasonably be updatable, but this should be done
with careful consideration.

Service owners should be encouraged to offer some form of partial update as an explicit extended operation
(admittedly, that sounds like RPC). Again, careful consideration is required. As an example, our merchant updates
are really the merchant submitting (PUTting) a contribution to the item. The PUT of this sort of resource triggers a
workflow that processes the contribution and adds the merchant's properties to the item (or not, if we already have
better data, for our definition of better).

10.3 “RPC Operation” Using a REST Interface


(Last Modified: Jan 13, 2009 - by Alex Yiu)

Reader/Editor Note:

(1) RFC keywords (such as “MUST”, “SHOULD”, “MAY”) are used in this document. They have normative
implications.

(2) In this proposal, we assume that there is a URL that points to the service itself. E.g.:
http://someHost:1234/myFooService .
We also assume there is a URL that points to a resource maintained by the service. E.g.:
http://someHost:1234/myFooService/pathToResourceX .
The URL (physical) vs URI (logical) usage consideration would be addressed in a separate section of the
document.

Limitation of Standard HTTP-Method


A REST interface is designed to leverage the semantics of standard HTTP Methods (e.g. “GET”, “PUT”, etc.).
Those methods are typically used to read or write the state of a data-centric resource.

However, a general business service might have other operations that do not fit into this paradigm. Those
operations typically perform an action or an algorithm. Such an operation might itself be viewed as a
resource. To keep this document concise, the terms “algorithmic resource” and “action resource” are used
hereafter to describe such operations.

Some of these operations are not tied to a particular data-centric resource. For example, consider a service which
manages orders as its data resource. This service might have a currency conversion operation (algorithm) which is
not tied to a particular order document. On the other hand, some of these operations can be based on a particular
data resource instance. The same example order service might have an operation to calculate the
shipping cost based on the ordered items and shipping option. It might also have another operation (action) to
finalize the order document and submit it to the fulfillment part of the business.

Reader/Editor Note:

[Merged with the existing doc related preamble]

Invoking an Algorithmic Operation


The HTTP POST method MUST be used to invoke an Algorithmic Operation. The target URL of the POST request MUST
refer to either a Resource or a Service, followed by an operation identification segment (e.g.
“;operation=myFooOperation”).

The operation name MUST contain only alphanumeric characters and underscores. It MUST start with an alphabetic
character.

The regular expression for a valid operation name is: [a-zA-Z][a-zA-Z0-9_]*
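
The naming rule can be checked mechanically. The following Python sketch applies the regular expression above and appends the operation identification segment to a target URL; the function names `is_valid_operation_name` and `operation_url` are hypothetical and only illustrate the rule.

```python
import re

# The pattern given in the text: a letter, then letters, digits or underscores.
OPERATION_NAME = re.compile(r"[a-zA-Z][a-zA-Z0-9_]*")


def is_valid_operation_name(name):
    """True if the whole name matches the operation-name pattern."""
    return OPERATION_NAME.fullmatch(name) is not None


def operation_url(target_url, operation):
    """Append the ';operation=<name>' segment to a Service or Resource URL."""
    if not is_valid_operation_name(operation):
        raise ValueError(f"invalid operation name: {operation!r}")
    return f"{target_url};operation={operation}"


url = operation_url("http://someHost:1234/myOrderService", "convertCurrency")
```

Using `fullmatch` rather than `match` matters here: `match` would accept a name such as `my-op` because its leading `my` satisfies the pattern.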
Reader/Editor Note:

An alternative design is to put the operation name in an HTTP header (e.g. “x-amzn-operation”).

Reasons to suggest a header instead of the URL to denote the operation name are: we have not nailed down the
details of the resource-naming scheme and query mechanism yet, and using a header would allow more flexibility to
innovate in that space.

Resource-based Method
When an Algorithmic Operation requires a Resource to complete, the target URL of the POST request SHOULD
refer to the corresponding Resource instance, followed by the operation identification segment. An example of
this kind of operation is calculating the shipping cost of an order. An example of a POST request URL that
refers to a particular order is:
http://someHost:1234/myOrderService/pathToOrder5678;operation=submitOrder

Service-based Method
When an Algorithmic Operation does not require a Resource to complete, the target URL of the POST
request SHOULD refer to the corresponding Service, followed by the operation identification segment. An
example of this kind of operation is currency conversion. An example of the target URL is:
http://someHost:1234/myOrderService;operation=convertCurrency

An example of an HTTP Request is:

POST http://someHost:1234/myOrderService;operation=convertCurrency
Content-Type: ...
Content-Length: ...

{
"convertCurrencyRequest" : {
"currencyCode": "USD",
"currencyAmt": 56.00,
"targetCurrencyCode": "EUR"
}
}

The above HTTP POST request is equivalent to the following pseudo method signature:

Double convertCurrency(
String currencyCode,
Double currencyAmt,
String targetCurrencyCode
)

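
To make the mapping between the pseudo method signature and the HTTP request concrete, here is a minimal Python sketch that turns the three arguments into a POST request description. The function `convert_currency_request` and the dictionary it returns are hypothetical. The field names follow the pseudo signature, and the `text/json` content type is an assumption, since the example request leaves the header value unspecified.

```python
import json


def convert_currency_request(service_url, currency_code, currency_amt,
                             target_currency_code):
    """Describe the POST that invokes convertCurrency on a service URL."""
    body = json.dumps({
        "convertCurrencyRequest": {
            "currencyCode": currency_code,
            "currencyAmt": currency_amt,
            "targetCurrencyCode": target_currency_code,
        }
    })
    return {
        "method": "POST",
        "url": f"{service_url};operation=convertCurrency",
        "headers": {
            # Assumed value; the example request elides the content type.
            "Content-Type": "text/json",
            "Content-Length": str(len(body.encode("utf-8"))),
        },
        "body": body,
    }


request = convert_currency_request(
    "http://someHost:1234/myOrderService", "USD", 56.00, "EUR")
```

Each argument of the method becomes one named field of the request structure, so adding an optional argument later does not disturb existing callers.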
OK Responses

HTTP status code: [TODO: to merge this section with generic status code description]

• 2XX, if the operation is invoked successfully

If the return type of an operation is the equivalent of “void”, a response with an empty body (of which “Content-Length”
is zero) SHOULD be used. In such cases, the “Content-Type” header MAY be left unspecified.

Error Responses
HTTP status code: [TODO: to determine what numeric values should be used for “XX” and to determine whether
we can add some extra system level errors here; to merge this section with the generic status code description]

• 4XX, if a service is not found

• 4XX, if a resource is not found

• 4XX, if an operation is not found

• 5XX-5XX, if a system level error occurs (e.g. a parsing error of the input payload)

• 5XX-5XX, if a business level error occurs (e.g. an input argument is out of range as specified by the
business logic)

The HTTP Response Body MAY contain details of an error, expressed in one of the data formats (such as Ion,
JSON and XML) supported by this specification. In such a case, “Content-Type” MUST be set accordingly.

10.4 Events / Notifications via HTTP

Certain client-server interactions are event-driven, meaning that the information that the client is interested in
will arrive at an undetermined time in the future. For instance, an AJAX application that implements instant
messaging waits for the arrival of new instant messages on behalf of the user.

In this case, clients should initiate the connection to the server using a GET request and wait for a set time threshold in
the range of 30 seconds to 5 minutes before terminating the existing request and initiating a new request. Clients
should use persistent connections to avoid unnecessary TCP startup costs.

Servers should accept the request and keep it open for a reasonable amount of time. If the server has no event to
send within the specified time limit, or if the server must return early for some
other reason, the server should reply with status "204 No Content". If the requested information does become
available, the server should return that content with status "200 OK". Servers which support interfaces of this
nature should take care to minimize the captive resources used to support an open connection, for instance by
using Servlet 3.0 suspendable requests.
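
The server-side decision between "204 No Content" and "200 OK" can be sketched as follows. This is an illustration only: `poll_for_event` is a hypothetical helper, the event source is modeled as a standard-library `queue.Queue`, and the very short timeout is chosen so the example finishes quickly rather than to reflect the 30 seconds to 5 minutes range discussed above.

```python
import queue


def poll_for_event(events, timeout_seconds):
    """Wait up to timeout_seconds for one event; return (status, reason, body)."""
    try:
        event = events.get(timeout=timeout_seconds)
    except queue.Empty:
        # Nothing arrived within the time limit.
        return 204, "No Content", None
    return 200, "OK", event


events = queue.Queue()

# No event is pending, so the poll times out.
empty_reply = poll_for_event(events, timeout_seconds=0.05)

# Once an event is queued, the next poll returns it immediately.
events.put({"from": "alice", "text": "hello"})
ready_reply = poll_for_event(events, timeout_seconds=0.05)
```

This version blocks a thread for the duration of the wait, which is exactly the kind of captive resource the paragraph above warns about; a production server would more likely suspend the request and resume it when an event arrives.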

10.5 Workflows
I believe we will need to separate automated workflows from those that contain one or more HITs (tasks that a
human is responsible for), and I have not yet done so. For the moment I am "ignoring" those workflows (and
workflow systems) that have activities which humans perform.

A workflow is a requested operation that may take a long time and is processed asynchronously from the
request that starts it. This is distinct from a typical REST method, which operates essentially synchronously: that is,
the request is complete when the result is returned. A request that is handled asynchronously is an intermediate
form. If the operation can be called synchronously then it is not a workflow (even if it is handled by a workflow).

A PUT or POST request is used to start a workflow: PUT if the user knows the ID, POST if not.

Workflows generally, but not universally, start with an "insert", that is, dropping off a request to do work. Examples of
this include merchant feeds, where an entire file is dropped off for processing; item data, where a merchant SKU
keyed datagram is put to the "front end" of the item pipeline; and the buy button, where a shopping cart starts the
process.

The result of this call will include, in the body, a token that can be used to reference the state of the workflow.
That state MUST include a status of at least IN-PROGRESS, FAIL, SUCCESS, or PARTIAL-SUCCESS. It may include a progress
indicator. It SHOULD return appropriate error information when failures (full or partial)
occur. This error information MUST include any identifier of the entity associated with the failure, the failure code (for example
the exception), and any additional information necessary for the caller to take appropriate action.
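
A workflow state document along these lines could be modeled as below. The class and field names (`WorkflowState`, `WorkflowFailure`, `token`, `progress`, and so on) are hypothetical; only the four status values and the three pieces of failure information come from the text above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# The minimum set of status values named in the text.
STATUSES = {"IN-PROGRESS", "FAIL", "SUCCESS", "PARTIAL-SUCCESS"}


@dataclass
class WorkflowFailure:
    entity_id: str      # identifier of the entity associated with the failure
    failure_code: str   # for example, the exception name
    detail: str = ""    # anything else the caller needs in order to act


@dataclass
class WorkflowState:
    token: str                          # returned when the workflow is started
    status: str
    progress: Optional[float] = None    # optional progress indicator
    failures: List[WorkflowFailure] = field(default_factory=list)

    def __post_init__(self):
        if self.status not in STATUSES:
            raise ValueError(f"unknown workflow status: {self.status!r}")

    @property
    def is_done(self):
        """True once the workflow has left the IN-PROGRESS state."""
        return self.status != "IN-PROGRESS"


running = WorkflowState(token="wf-1", status="IN-PROGRESS", progress=0.4)
partial = WorkflowState(
    token="wf-1",
    status="PARTIAL-SUCCESS",
    failures=[WorkflowFailure("ACME/DOLL", "ValidationError", "missing listing")],
)
```

A client would typically start the workflow, keep the token, and then GET the state repeatedly until `is_done` becomes true.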

10.6 Extending HTTP Methods

Services may offer extended methods. These should be either included in the HTTP header <put exact header
here> or accessed through batch.

A notable example is partial update of a resource, perhaps as the method MERGE <any other suggestions for this?
Again, if it is common it should be constant.>.

Another example is the "factory" methods, such as append, where the caller doesn't know the key and the service assigns it,
or data conversion, such as uploading a bmp when the resource is the equivalent jpg. These patterns need examples
and naming conventions.



11 APPENDIX A: Glossary

[Prepared by Alex Yiu; Last Modified: Jan 29, 2009 – 1am]


[Note: We are considering replacing “Data Item” with “Data Object” / “Data Value” globally in this doc.]

Data item - A Data Item is a unit of data, which is either a scalar value (e.g. an integer or a string) or a container
which holds zero or more data items (e.g. an IonStruct or a JSON Array).

Structure - A Data Item that has a collection of named fields. Examples are an IonStruct, a JSON Object and an XML
Element.

Sequence (of Data Items) - A Sequence of Data Items is an ordered collection of zero or more Data Items. The size
of a Sequence MAY be unknown. A Sequence is an abstract data model concept for REST computing (particularly
REST-Query related). A Sequence might be represented by an on-the-wire format (e.g. a JSON array) in some cases,
while in other cases a Sequence might not be represented effectively or correctly by an on-the-wire format
(e.g. a JSON array would not be able to represent a Sequence of unknown size effectively). Sequences never
contain other Sequences. If Sequences are combined, the result is always a “flattened” Sequence. This flattened
nature simplifies the data model for REST computing. [Note: a JSON array or an Ion List is NOT identical to a
Sequence.]

Resource - A Resource is a data-oriented service component that is made available to users to perform actions on, and it
MUST always be identified by a URL. Possible actions on a resource are retrieval, mutation or invocation. Actions
are performed through one of the HTTP methods, e.g. GET/PUT/DELETE/POST. During mutation and invocation,
the HTTP Request Body MAY have zero or one Data Item or a Sequence of Data Items as its input. During retrieval,
mutation and invocation, the HTTP Response Body MAY have zero or one Data Item or a Sequence of Data Items as
its output. Resources can be further divided into different sub-categories: entity resources, algorithmic resources,
query resources.

REST Service - A REST Service is a service that manages a collection of resources through the HTTP protocol and MUST
always be identified by a URL (a.k.a. Service URL).

An example of a Service URL is: http://host:port/MyFooService

Entity Resource - An Entity Resource is a resource: 1) that accepts HTTP GET/PUT/DELETE methods; 2) that
represents a state which lives beyond the duration of the HTTP methods. The state of an Entity Resource is
represented by a single Structure or a Sequence of Structures. The state of an Entity Resource can be retrieved by
HTTP GET and can be mutated by HTTP PUT/DELETE. While some resources return a Sequence of Data Items, the
ordering of Data Items MAY be unspecified in some cases. [Open Issue: An Entity Resource MAY accept POST for
other operations, which may be non-idempotent, such as partial update operations.] An Entity Resource
URL is formed by adding one or more non-empty relative URL path-segments to the Service URL.

Examples of Entity Resource URLs are: http://host:port/MyFooService/US/Electronics/A1234765 and
http://host:port/MyFooService/US/Electronics

Individual Entity Resource - an entity resource which represents a single Structure (i.e. not a Sequence of
Structures) [Open Issue: Can a scalar value (without any structure around it) be an Entity Resource itself?]



Entity Resource Collection – a Sequence of zero or more Structures; it is also an Entity Resource itself, which
implies an Entity Resource Collection MAY accept GET/PUT/DELETE methods. [Open Issue: Should we allow
“http://host:port/MyFooService/US/Electronics” to point to a collection of resources at a service’s discretion?
Should we allow that collection to be potentially updateable?]

Entity Resource Hierarchy - Individual Entity Resources managed by a REST service SHOULD be organized in a
hierarchy of Collections (i.e. Entity Resource Collections) at the service designer’s discretion. All Individual Entity
Resources in a Collection MUST share the same scalar value of a particular named field. All Individual Entity
Resources in a Collection (i.e. the parent Collection) can be further divided and organized into child Collections
(hence, the notion of hierarchy). Individual Entity Resources in a child Collection will share another scalar value of
an additional named field.

Each Collection in the hierarchy is identified by a URL. The URL is formed by adding path-segments to the Service
URL. Each path-segment represents a shared scalar value in a collection in the hierarchy. The ordering of the path-
segments added to the URL corresponds to the hierarchical Collection ordering, from parent to child.

A REST Service has discretion to reject HTTP requests to an Entity Resource Collection. An example scenario is: an
HTTP GET request to an Entity Resource Collection is rejected because the Collection is too expensive to compute.

[Open Issue: From Chris: does the shared value have to be a scalar? From Alex: gut feeling says we can expand to
non-scalar data objects, as long as the service provides clear “toString()” and “equal()” semantics on those
non-scalar data objects.]

When the URL refers to a collection of Entity Resources, the trailing “/” is insignificant, to achieve web browser
friendly behavior. For example, these URLs are equivalent in the context of Entity Resources:
http://host:port/MyFooService/US/Electronics/ and http://host:port/MyFooService/US/Electronics
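
The URL rules for the hierarchy can be illustrated with a short Python sketch. Both helper names are hypothetical, and percent-encoding each shared value with `urllib.parse.quote` is our assumption; the glossary does not say how values are escaped.

```python
from urllib.parse import quote


def entity_url(service_url, *path_values):
    """Append one path-segment per shared value, ordered from parent to child."""
    if not path_values or any(str(v) == "" for v in path_values):
        raise ValueError("at least one non-empty path-segment is required")
    segments = [quote(str(v), safe="") for v in path_values]
    return service_url.rstrip("/") + "/" + "/".join(segments)


def same_collection(url_a, url_b):
    """Compare two collection URLs, treating a trailing '/' as insignificant."""
    return url_a.rstrip("/") == url_b.rstrip("/")


service = "http://host:port/MyFooService"
collection = entity_url(service, "US", "Electronics")
item = entity_url(service, "US", "Electronics", "A1234765")
```

Here `collection` identifies the child Collection whose members share the values "US" and "Electronics", and `item` identifies one Individual Entity Resource inside it.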

Algorithmic Resource - An Algorithmic Resource is a resource that accepts HTTP POST and GET methods only. The
URL of an Algorithmic Resource is formed by adding the name of the algorithm (a.k.a. operation) as a parameter to an
Entity Resource URL or a Service URL. HTTP GET SHOULD be used only when the operation does not create any
user-visible side effects (which also implies the operation is idempotent).

Query Resource – A Query Resource is a resource where a Query is applied to a resource as its input. A Query
Resource can be used as an input to another Query Resource. A Query Resource itself is READ-ONLY. A Query
Resource accepts HTTP GET only if the input resource accepts the HTTP GET method (e.g. an Entity Resource).
Alternatively, it accepts the HTTP POST method only if the input resource accepts the HTTP POST method (e.g. an
Algorithmic Resource). A Service MAY reject requests to a Query if it deems the Query too expensive to
compute. [Open Issue: URL Syntax TBD.]

Query - …

11.1 Normative References:

• Uniform Resource Identifier (URI): Generic Syntax - http://www.ietf.org/rfc/rfc3986.txt

• The application/json Media Type for JavaScript Object Notation (JSON) -
http://www.ietf.org/rfc/rfc4627.txt

11.2 Non-Normative References:

• XQuery 1.0 and XPath 2.0 Data Model (XDM) - http://www.w3.org/TR/xpath-datamodel/

• HTTP User’s Guide -
https://portal.ant.amazon.com/sites/EPPG_HQ/active_programs/Architecture/Shared%20Documents/Working%20Group%20--%20Distributed%20Computing%20Interfaces/HTTPRest.doc

OLD TEXT

12 APPENDIX: Old Notes

12.1 the resource URL


sample use cases:

FILE UPLOAD

File upload is a common operation in that many Amazon services take large files in from their users. Examples
include the Digital team getting original digital content and Merchant services accepting feeds with Item data.
When the inbound file is itself a resource, that is, something you can GET and for which the submitter knows the
identifying key, PUT is the appropriate method. In other cases this would be handled using a POST, with a suitable
alternate method.

NOTE: partial update is an example of an operation that needs its own method. Should we define this? It's not
"POST".

a simple resource

The basic use case is that the developer wants to make the management of a single resource available to a "public"
audience. While this is nearly a "toy" example it is the basic foundation and we have a large number of examples
of this.

item front end

<tbd>



starting and "watching" a work flow

<tbd>

publishing a report

<tbd>

feature decisions (how to use http for amzn rest)

• extended methods
o will we have them? - yes
o how to handle them - tbd, current proposal is through “programmatic resources”
• metadata access - what to offer (or require), where it lives
o schema language for shape - SDL schema
o JSON vs Ion vs XML vs BSF – JSON for public, Ion internal
o language for API – SDL
o what about other policies, like security, sla's, etc
• URL use - keys, query parameter, other tokens
o sub document addressing (continued from last week)
o programmatic resources
• query parameter use
• query in general
• http header use
o especially content type (et al)
o security
o context tracking
o etags
o cache control
o (can we redirect a POST to a PUT?)
• cookies (or should they get their own line item?)
• error code use
• API definition on client (in general)
• API definition on server (in general)

FAQ (alpha - I'll add some answers shortly):


What is REST?

What is a resource?



REST is pretty fuzzy, which flavor of REST do you mean?

My service interface doesn't have any resources, how could I possibly use REST to expose it?

If I'm updating information why shouldn't I use GET to do it?

If my operation isn't a query don't I use POST for it?

My browser (or the one I need to support) doesn't support PUT, what do I do now?

How do you encode the resource key in the URL?

How do I support batches?

How do I support transactions?

How do I describe my service?

What do I use to describe my resources?

When things change do I have to do anything special, or just change my service?

How do I protect myself from services that keep changing?

How do I do security?

What code do I use on the client?

What code do I use to get this into my service?

Where can I go for more information?

12.2 Sample interactions from our white board discussions:


Request:

PUT to IMSv2/Contributions/ACME/DOLL

Document:

Item:: {
merchant:ACME,
sku:DOLL,
listing::{ … },
product:: {
description:"a doll",

},
}
Returns ok / accepted

Returns request id / or URI



Can have side effects (e.g. starting a workflow)

MUST be able to perform GET on IMSv2/Contributions/ACME/DOLL – returns status or result.

MUST be idempotent – i.e. PUT to a user-known key; PUTable resources are a strict subset of GETable resources.

All side effects are idempotent.

Resources are(?) (should be?) versioned, all side-effects are(?) versioned.

POST- eg

POST to /Feeds/SFF/ACME

Same document as above, returns back id for submitted item.

Can have alternative ways of retrieving data, e.g. retrieving side-effect data such as ASIN

RCAT_OFFERS/mk/asin

Returns document for ASIN

12.3 Questions:
For each arc, is it a get? Put? Post? How to decide?

How to do async?

How to do transactional workflow?

Paging? (Start? Count?)

When returning a list of items – JSON – would use an array to represent the list (JavaScript constraints)

What is max URL size limit? 2K? (Need to verify)

Query parameters canonicalization?

Steps

• Step1 – merchant submits to feed, creates contribution, ASIN, listing

o Find items ** query

o Get listings

o Choose listing

• Step2 – add to cart ** workflow



• Step3 – check out “yes, I’ll pay” ** transacted

o Check prices

• Step4 – order processing ** observable workflow

• Check availability – yes – no – later

• Reduce Availability

• Fulfillment confirmation

• Charge visa

• Generate pick list

Find Items:

POST?

GET? /ItemSearch/<MARKETPLACE>/
?
BrowseNode=137;
Keyword=doll;
Keyword=new;
Returns back ASIN’s/ other data

{

asins:[Bxxxxxxxxx, Byy… ]
}
Or

Header

{
some status
{ asin:Bxxxxxxxxx}
{ asin:Byy…}
}
???

Cacheability - What about with customerID?

Less interesting, but not intrinsically wrong.

GET vs POST ?

• If 20% or more break the 2K limit, do you use POST instead?



• GET is more cacheable

o Header must include cache-ability parameters as Proxies/caches will (may) consider GET with
query parameters uncacheable.

• POST representing GET as an extreme is accepted

Order independence?

• Should they be sorted for canonicalization?

12.4 Misc Notes:

As we describe how resources are named, (LHS and RHS), we should consider using the terminology from RFC3986,
“URI Generic Syntax”:

Abstract

A Uniform Resource Identifier (URI) is a compact sequence of characters that identifies an abstract or
physical resource. This specification defines the generic URI syntax and a process for resolving URI
references that might be in relative form, along with guidelines and security considerations for the use of
URIs on the Internet. The URI syntax defines a grammar that is a superset of all valid URIs, allowing an
implementation to parse the common components of a URI reference without knowing the scheme-
specific requirements of every possible identifier. This specification does not define a generative
grammar for URIs; that task is performed by the individual specifications of each URI scheme.

Some more data for the RPA use case (which will influence PAS REST interface):

Some of the parameters for RPA are an obvious choice to qualify the resource (e.g. MarketplaceId, MerchantId,
ASIN)

Some parameters control the subset of data that is to be returned (e.g. prefetch list / flavors in RPA, "facets" in
PAS)

Some are data qualifiers / obvious query parameters (e.g. "PreferMerchantImages", "CustomerIsPrime",
"UseFMAv3")



The traditional form of RPA batching takes a collection of requests (e.g. MarketplaceId, MerchantId, ASIN,
PreferMerchantImages all specified), allowing RPA to parse the list in parallel, and then returning a collection of
results with a 1:1 mapping of the results to the requests.

If a batch of 20 ASINs is provided, then this allows RPA to perform, e.g., 40 service calls as opposed to 800
service calls.

Nothing in the RPA (or PAS) use case requires/desires transactions.

12.5 user resources:

These are the entities your service understands, and usually controls. They have a name and some form of
identifier. The URL of a resource must include the service name, the resource name and all the key parts needed to
identify the resource.

resource name - this should be an informative, reasonably short symbol.

key parts - the resource is identified by a key. The key may have one or more logical fields. Marketplace and ASIN
are an example of a two part key. Each of the key parts should

The key parts mimic a directory tree. As such, consideration of their order is important. It may also be useful to
support a specialized form of query through partial key specification. This would return the sub-tree, subject to
any filters specified using URL query parameters. [cas: this should be the only use of query parameters.]
