CloudComputing DATABASE
Overview:
This unit provides an overview of the types of data stores that are used in cloud computing. You will also learn about the data
service offerings that are available through the cloud development platform.
Data is defined as a set of facts, statistics, or figures. It can be in many formats, such as text documents, images, audio, or
videos.
Raw data, which is the most basic format of data, is processed to produce useful information. Data processing and analysis
help modern organizations increase their productivity and make better business decisions.
Data can be divided into two main categories: structured data and unstructured data. Structured data is formatted, highly
organized data that fits easily into data models with fixed fields. An example is a list of students or employees that includes
their names, ages, and addresses. Unstructured data is the opposite: it is unorganized, raw data that has no formal structure,
and it is sometimes described as loosely structured data. Examples include unstructured text and multimedia such as email
messages, web pages, documents, photos, audio files, and videos.
There is a popular saying that data is the new oil, because data and the information that is obtained by processing it
play an important role in modern organizations and contribute to the development of new business models. The organizations
that are considered the most successful are those that can capture, manage, and derive key insights from their corporate data.
Cloud technologies enable small organizations to design and set up data platforms and use data analysis services on the cloud
quickly, and to benefit from the scalability, reliability, and quality of service that the cloud provides. These factors help
these organizations evolve quickly and grow faster in the market.
• Graph databases:
▪ Compose for JanusGraph (Beta)
JanusGraph is a scalable graph database that is optimized for storing and querying highly interconnected data that is
modeled as millions or billions of vertices and edges.
IBM Cloudant
Cloudant is an IBM software product that is primarily delivered as a cloud-based, non-relational, distributed
database service. Cloudant is based on the Apache-backed CouchDB project and the open source BigCouch project.
IBM acquired Cloudant, a Boston-based cloud database startup, in 2014.
IBM Cloudant is a NoSQL database as a service (DBaaS) that is optimized for handling heavy workloads of
concurrent reads and writes in the cloud. These workloads are typical for large, fast-growing web and mobile
apps. It is built to scale globally, run continuously, and handle various data types, such as JSON, full-text,
and geospatial.
Cloudant ensures that the flow of data between an application and its database remains uninterrupted and
performs to the users’ satisfaction. The data replication technology also allows developers to put data closer
to where their applications need it most.
Cloudant frees developers from managing the database, which enables them to focus on the application.
Cloudant eliminates the risk, cost, and distractions of database scalability, so you regain valuable time
and your applications can scale larger and remain consistently available to users worldwide.
Data is stored and sent in JSON format. The data documents are accessed through a simple REST-based
HTTP API. Anything that is encoded into JSON can be stored as a document.
Documents in Cloudant:
Cloudant documents are containers for data, and the documents are JSON objects. All documents in Cloudant must contain
the following unique fields:
• An identifier _id field serves as the document key. It can be created by the application or generated automatically by
Cloudant.
• A revision number _rev field is generated automatically by the server when you insert or modify a document, and Cloudant
uses it internally to track document revisions. You must specify the latest _rev when you update a document, or your request
fails. This check helps avoid conflicting data states.
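To make the two required fields concrete, the following minimal Python sketch models a document as a plain dictionary. The
field names other than _id and _rev, the sample values, and the with_revision helper are hypothetical and exist only for
illustration. In a real database, the server assigns the revision string.

```python
import json

# Hypothetical document: only _id and _rev are fields that Cloudant requires.
new_doc = {"_id": "student-1001", "firstname": "Ana", "lastname": "Brown", "age": 21}


def with_revision(doc, rev):
    """Return a copy of doc that carries the revision last reported by the server."""
    updated = dict(doc)
    updated["_rev"] = rev
    return updated


# "1-abc123" is a made-up placeholder; real revision values come from the server.
stored = with_revision(new_doc, "1-abc123")
print(json.dumps(stored, indent=2))
```

An update request would send the stored form, so that the server can compare the supplied _rev with the latest one it holds.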
Cloudant Dashboard:
Cloudant Dashboard is a cloud-based web interface that makes it easy to develop, administer, and monitor
your databases. You can perform many tasks, such as:
• View and manage Cloudant databases.
• View and create documents.
• Create and run queries.
• Manage the permissions to the database.
• View capacity usage (reads/second, writes/second, storage limit, and so on).
• Manage the plan settings (upgrade plan, raise throughput capacity, and so on).
You can also display the contents of a Cloudant document in IBM Cloud by selecting the database. Then,
select All Documents to display the list of documents. You can open any document in the list
to display or modify its contents.
Cloudant HTTP API:
Cloudant uses an HTTP API to provide simple, web-based access to data in the Cloudant data store. The HTTP API is a
programmatic way of accessing the data from your applications. It provides several HTTP access methods for data read,
add, update, and delete functions.
The following HTTP Request methods can be used to apply the create, read, update, and delete operations on Cloudant
documents by directly referencing the document ID:
• GET: Request a specific JSON document.
• POST: Set values and create documents.
• PUT: Create databases, and create or update documents.
• DELETE: Delete a specific document.
To create a document, you can send a POST request to https://$USERNAME.cloudant.com/$DATABASE with the
document's JSON content in the request body.
To update (or create) a document, you can send a PUT request to
https://$USERNAME.cloudant.com/$DATABASE/$DOCUMENT_ID with the updated JSON content, including the latest
_rev value in the request body.
To delete a document, you can send a DELETE request to
https://$USERNAME.cloudant.com/$DATABASE/$DOCUMENT_ID?rev=$REV where $REV is the document's latest _rev.
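The create, update, and delete requests that are described above can be sketched in Python with the standard library. This
sketch only builds the requests and does not send them. The build_request helper, the account URL, the database name, and
the document values are hypothetical, and authentication is omitted even though a real account requires it.

```python
import json
import urllib.parse
import urllib.request


def build_request(method, base_url, database, doc_id=None, body=None, rev=None):
    """Build (but do not send) an HTTP request for a database or document URL."""
    url = f"{base_url}/{urllib.parse.quote(database, safe='')}"
    if doc_id is not None:
        url += "/" + urllib.parse.quote(doc_id, safe="")
    if rev is not None:
        url += "?" + urllib.parse.urlencode({"rev": rev})
    data = None
    headers = {}
    if body is not None:
        # The document content travels as JSON in the request body.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


BASE = "https://example-account.cloudant.com"  # hypothetical account URL

# POST to the database URL creates a document.
create = build_request("POST", BASE, "students", body={"lastname": "Brown"})
# PUT to the document URL updates it; the body carries the latest _rev.
update = build_request("PUT", BASE, "students", doc_id="student-1001",
                       body={"_rev": "1-abc123", "lastname": "Brown"})
# DELETE passes the latest revision as the rev query parameter.
delete = build_request("DELETE", BASE, "students", doc_id="student-1001", rev="1-abc123")

for req in (create, update, delete):
    print(req.get_method(), req.full_url)
```

Passing one of these objects to urllib.request.urlopen would send it. Building the requests separately keeps the URL and body
rules easy to inspect.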
Cloudant indexes:
A database index is a sorted data structure that enables quick access to a portion of the data. By default, IBM Cloudant
generates a primary index for the _id field so that it can retrieve data by _id.
A user can create secondary indexes for other fields if there are many queries that run on these fields.
After you create an index, a design document is generated on Cloudant to describe the index that is created. Design
documents are used to build indexes, validate updates, and format query results.
The index type is either text or JSON. Text indexes are powered by Cloudant search indexes, which enable you to query a
database by using Lucene Query Parser. JSON indexes are powered by MapReduce.
Cloudant Query:
Before you query for a specific field, it is a best practice to create an index for each field in the selector to optimize query
performance.
The JSON body that is provided shows an example of a Cloudant query request body. In this example, the response to this
request returns Cloudant documents that have lastname = ‘Brown’ and location = ‘New York City, NY’. Only the firstname,
lastname, and location fields of each document are returned. Some advanced operators can be used in the
Cloudant query, such as the $eq (equal) and $gt (greater than) operators that are used to search for documents.
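A query request body of the kind that is described above can be sketched as a Python dictionary. The build_find_body helper
is hypothetical. The selector and fields keys follow the Cloudant Query format as commonly documented, and the body would
typically be sent with a POST request to the _find endpoint of the database. Check the exact shape against the Cloudant
documentation before relying on it.

```python
import json


def build_find_body(selector, fields=None, limit=None):
    """Assemble a Cloudant Query request body from a selector and optional settings."""
    body = {"selector": selector}
    if fields:
        body["fields"] = list(fields)  # restrict which document fields are returned
    if limit is not None:
        body["limit"] = limit
    return body


# Match documents where lastname equals "Brown" and location equals "New York City, NY".
query = build_find_body(
    {"lastname": {"$eq": "Brown"}, "location": {"$eq": "New York City, NY"}},
    fields=["firstname", "lastname", "location"],
)
print(json.dumps(query, indent=2))
```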
You can use Cloudant endpoints to create, list, update, and delete indexes in a database, and to query data by using these
indexes.
The JSON body that is provided shows an example of a create index request body. In this example, an index of type text is
created for a field that is called “foo”. After this index is created, Cloudant queries that search for documents by using
the “foo” field are more efficient and faster.
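A create index request body for the “foo” field can be sketched in the same way. The build_text_index_body helper and the
index name are hypothetical, and the body shape is an assumption based on the commonly documented format for text indexes,
in which each field is listed with a name and a type. Such a body would typically be sent with a POST request to the _index
endpoint of the database.

```python
import json


def build_text_index_body(field, name=None):
    """Assemble a request body that asks for a text index on one string field."""
    return {
        "index": {"fields": [{"name": field, "type": "string"}]},
        "name": name or f"{field}-text-index",  # hypothetical naming scheme
        "type": "text",
    }


index_body = build_text_index_body("foo")
print(json.dumps(index_body, indent=2))
```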
HTTP status codes:
Cloudant returns HTTP status codes in the HTTP response headers.
More information about the result might also be included in the response body.
The following example status codes adhere to the widely accepted status codes for HTTP:
• 200 - OK
• 201 - Created
• 400 - Bad request
• 401 - Unauthorized
• 404 - Not Found
For example, if you try to use https://$USERNAME.cloudant.com/$DATABASE/$DOCUMENT_ID to retrieve a document
that does not exist in the database, Cloudant responds with status code 404 in the header. Other information about the
error is returned in the response body as JSON, as shown in the slide.
The language-specific libraries often include error handling for these various cases.
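When no such library is used, an application can interpret the status code and the JSON error body itself. The following
sketch covers only the codes that are listed above. The describe_response helper is hypothetical, and the sample error body
with its error and reason keys is an assumed, illustrative shape.

```python
import json

# Only the status codes that are listed in this unit.
STATUS_MEANINGS = {
    200: "OK",
    201: "Created",
    400: "Bad request",
    401: "Unauthorized",
    404: "Not Found",
}


def describe_response(status, body_text=""):
    """Summarize a response from its status code and optional JSON body."""
    detail = {}
    if body_text:
        try:
            parsed = json.loads(body_text)
            if isinstance(parsed, dict):
                detail = parsed
        except json.JSONDecodeError:
            detail = {}  # a body that is not JSON carries no structured detail
    return {
        "ok": 200 <= status < 300,
        "status": status,
        "meaning": STATUS_MEANINGS.get(status, "Unknown status"),
        "error": detail.get("error"),
        "reason": detail.get("reason"),
    }


# Illustrative body for a request for a document that does not exist.
missing = describe_response(404, '{"error": "not_found", "reason": "missing"}')
print(missing)
```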