ITECH 2201 Cloud Computing School of Science, Information Technology & Engineering

Workbook for Week 6
Exercise 1: Data Science (1 mark)
Read the article at http://datascience.berkeley.edu/about/what-is-data-science/ and answer the following:
• What is Data Science?
Answer: Data science is the process of organizing, packaging and delivering data. Organizing covers the storage and management of data. Packaging means manipulating raw data and combining it into a new, usable package. Delivering ensures that the data reaches the people who need it.
• According to IBM's estimate, what percentage of the data in the world today has been created in the past two years?
Answer: According to IBM, 90% of the data in the world today has been created in the past two years.
• What is the value of petabyte storage?
Answer: 1 petabyte = 10^15 bytes
OR
1 petabyte = 1 million gigabytes
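As a quick check that the two values above agree, the arithmetic can be sketched in Python. The sketch assumes decimal (SI) units, where 1 gigabyte = 10^9 bytes; the constant names are illustrative only.

```python
# Decimal (SI) storage units are assumed here; binary units (PiB, GiB) differ.
BYTES_PER_GIGABYTE = 10**9
BYTES_PER_PETABYTE = 10**15

# Number of gigabytes that fit into one petabyte.
gigabytes_per_petabyte = BYTES_PER_PETABYTE // BYTES_PER_GIGABYTE
print(gigabytes_per_petabyte)  # 1000000
```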
• For each course, both foundation and advanced, that you find at
http://datascience.berkeley.edu/academics/curriculum/ briefly state (in 2 to 3
lines) what it offers, based on the given course description as well as the video.
Answer: The link above shows that the Berkeley School of Information offers
a Master's degree in Information and Data Science, whose curriculum is divided
into foundation courses, advanced courses and a synthetic capstone course.
Exercise 2: Characteristics of Big Data (2 marks)
Read the following research paper from IEEE Xplore Digital Library
Ali-ud-din Khan, M.; Uddin, M.F.; Gupta, N., "Seven V's of Big Data
understanding Big Data to extract value," American Society for Engineering
Education (ASEE Zone 1), 2014 Zone 1 Conference of the , pp.1,5, 3-5 April 2014
and answer the following questions:
• Summarise the motivation of the author (in one paragraph)
Answer: The authors are motivated by the view that “Big data has a lot
of potential in real world industry and research community”, and they explain
it through the concept of the 7 V’s. According to the authors, the utilization of Big Data will be
the centre of focus for both researchers and students in the coming years.
• What are the 7 V’s mentioned in the paper? Briefly describe each V in
one paragraph.
Answer: The 7 V’s mentioned in the paper are as follows:
• Volume:
Volume is the size of the data. It refers to the amount of data created by
sources such as video, audio, text and social networking.
• Velocity:
Velocity refers to the speed at which data is processed. The performance
of big data processing depends on velocity, which in turn depends on the volume
of data.
• Variety:
Variety means diversity in the types of data. The more complex the data,
the greater the chance of errors occurring.
• Veracity:
Veracity is the truthfulness of the data, that is, trusted data
that contains no duplicates.
• Validity:
Validity is similar to veracity, but it refers to the correctness and accuracy
of the data with respect to its intended usage.
• Volatility:
Volatility is the retention policy that an organization applies to its
structured data. Volatility is significantly affected by
Volume, Velocity and Variety.
• Value:
Last but not least, Value is the most critical of all the V's, because what ultimately
matters is the output of the process.
• Explore the authors’ future work by using reference [4] in the
research paper. Summarise your understanding of how Big Data can
improve the healthcare sector in 300 words.
Answer: Nowadays the healthcare sector is widely adopting digitization for
analysing its data, and this is growing day by day. Given the size of the
healthcare industry, it is reasonable to expect that it will produce an immense amount of
data over the coming years. Big data can improve patient care: many countries in
the world today are implementing EHRs (Electronic Health Records), which
make most patient data optimized and centralized. Electronic
Health Records will create a large body of data that can be
re-identified and re-analysed for valuable information.
Sensors can be used to improve the monitoring of patients by
applying big data technology. With this
technology, patient satisfaction can be improved, the time doctors or nurses
need to attend to a patient can be reduced, and the use of expensive
healthcare equipment can be optimized. Apart from this, the monitoring of
patients by doctors or nurses becomes continuous and can be
carried out from anywhere.
Big Data can help to prevent health insurance fraud and
abuse by linking different data sets, which gives
insurance companies and hospitals rich information
with which to narrow down the fraud involved.
Although big data requires a large amount of investment,
it also creates opportunities for the healthcare industry.
Furthermore, one of the greatest challenges will be to protect
patients' information and the security of the data.
In conclusion, healthcare
organizations that want to implement big data should be careful about
privacy and security and keep them at the top of their list of
priorities.
Exercise 3: Big Data Platform (1 mark)
In order to build a big data platform one has to acquire, organize and analyse the
big data. Go through the following links and answer the questions that follow the
links:
• http://www.infochimps.com/infochimps-cloud/how-it-works/
• http://www.youtube.com/watch?v=TfuhuA_uaho
• http://www.youtube.com/watch?v=IC6jVRO2Hq4
• http://www.youtube.com/watch?v=2yf_jrBhz5w
Please note: You are encouraged to watch all the videos in that series from Oracle.
• How to acquire big data for enterprises and how it can be used?
Answer: Big Data can also be a big challenge when it is used to solve the problems of an
organization. The real issues include data flows, storage, analytics
and interactive media interfaces. Big data is the collection
of data that surrounds an organization, such as posts on social
networking sites, online transaction records of purchases and sales, digital pictures
and online video posts. The purpose for an enterprise of acquiring big
data is to build a plan that supports the requirements for
integrating its network infrastructure, data storage and ongoing
business models.
Two building blocks for acquiring big data in an enterprise are:
• Hadoop
• NoSQL
• How to organize and handle the big data?
Answer: Big Data is an emerging technology that enables organizations to
look more closely at their unstructured data in a way that is
effective in both cost and feasibility. Organizations that
have learned how to make use of big data understand that they
should analyse the data they take in. Big data can be
handled by sampling the data.
Even when the data is very large and complicated to
analyse, it can be maintained and handled easily
by examining it in smaller pieces.
• What are the analyses that can be done using big data?
Answer: The analyses that can be done using big data include the
following:
• Streaming data analysis
• Real-time analysis
• NoSQL and ad hoc query analysis
• Elastic ad hoc analysis
• Batch analysis
Part B
(4 Marks)
Part B answers should be based on well-cited articles/videos; name the references
used in your answer. For more information, read the guidelines given in
Assignment 1.
Exercise 4: Big Data Products (1 mark)
Google is a master at creating data products. Below are a few examples from
Google. Describe the products below and explain how large-scale data is used
effectively in these products.
• Google’s PageRank
Answer: As the name suggests, PageRank surveys web pages and ranks them
according to their importance.
“According to Google, it is an algorithm used by Google search to rank
websites.”
• Google’s Spell Checker
Answer: It helps users write text with correct spelling and
grammar. It is mainly useful when writing e-mails.
• Google’s Flu Trends
Answer: It estimates flu trends across the globe and predicts
flu activity in different countries.
• Google’s Trends
Answer: As the name suggests, it shows current search trends
across the world.
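To illustrate how link data can be turned into a page ranking, the following is a minimal sketch of the textbook PageRank power iteration. It is a simplified illustration, not Google's production algorithm; the four-page `toy_web` graph, the damping factor of 0.85 and the function name are assumptions made for this example.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to; every
    linked page must also appear as a key of the dictionary.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share (random jump).
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page passes its rank on equally to the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page without links spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical toy web: A, B and D all link to C, and C links back to A.
toy_web = {"A": ["C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)
most_important = max(ranks, key=ranks.get)
print(most_important)  # C
```

Page C ends up with the highest rank because three other pages link to it, which matches the idea of ranking pages by importance.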
Exercise 5: Big Data Tools (2 marks)
Briefly explain why a traditional relational database (RDBMS) is not effectively
used to store big data?
Answer: An RDBMS is not well suited to storing big data
because it cannot yet meet the demands of big data. The
amount of data that has to be handled is soaring in terms of volume. An RDBMS has
a rigid schema, whereas big data requires support for a wide range of
different data types.
• What is NoSQL Database?
Answer: NoSQL stands for Not Only SQL. A NoSQL database is a mechanism for
the storage and backup of data that supports a
wide variety of data types.
• Name and briefly describe at least 5 NoSQL Databases
Answer: More than 150 NoSQL databases are currently used by organizations.
Below are five NoSQL database categories, each of which has subcategories that
organizations adopt depending on their needs.
• Column: Column-oriented databases offer high performance and scalability, moderate
flexibility, low complexity and minimal functionality.
• Document: Document-oriented databases offer high performance, scalability and flexibility,
with low complexity and functionality.
• Key-Value: This type of database offers high performance, scalability and flexibility.
• Graph: Graph databases are variable in performance and scalability, high in
flexibility and complexity, and their functionality is based on graph
theory.
• Multi-model: This kind of NoSQL database is described as moderate in
complexity and variable in performance and scalability, and its functionality
is based on relational algebra.
• What is MapReduce and how it works?
Answer: MapReduce is a programming model, with an associated implementation, for
generating and processing large data sets with parallel, distributed algorithms.
MapReduce works in three steps:
• Map step
• Shuffle step
• Reduce step
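The three steps can be sketched with the classic word-count example. This is a single-process illustration only: a real MapReduce framework would run the map and reduce functions in parallel on many machines. The function names and the sample documents are assumptions made for this sketch.

```python
from collections import defaultdict


def map_step(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_step(mapped_pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_step(groups):
    """Reduce: combine the values of each key into a single result."""
    return {key: sum(values) for key, values in groups.items()}


# Hypothetical input: each string stands for one document or input split.
documents = ["big data needs big storage", "cloud storage scales"]
mapped = [pair for doc in documents for pair in map_step(doc)]
word_counts = reduce_step(shuffle_step(mapped))
print(word_counts["big"], word_counts["storage"], word_counts["cloud"])  # 2 2 1
```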
• Briefly describe some notable MapReduce products (at least 5)
Answer: MapReduce design patterns are classified into six categories:
• Summarization
• Filtering
• Data Organization
• Join
• Metapatterns
• Input and Output
• Amazon’s S3 service lets users store large chunks of data on an online
service. List 5 features of Amazon’s S3 service.
Answer: Five features of Amazon’s S3 service are as follows:
• Durability
• Availability
• Security
• Integrity
• High performance
Exercise 6: Big Data Application (1 mark)
Name 3 industries that should use Big Data – justify your claim in 250 words
for each industry using proper references.
Answer: Three industry players that use Big Data are
• GE (General Electric)
• AYASDI
• IBM
General Electric: GE is known for its strength in manufacturing electrical
machines and appliances. It is a big data company with a vision of expanding its
industrial internet on a large scale. In a joint venture with Accenture, GE has
created software that backs up data to the cloud for railways and airlines.
AYASDI: AYASDI takes a visual approach to replacing guesswork with the
output of big data.
It has created an influential visualization to help Washington, DC, which was
based on smart war tanks. The company has also contributed financially to
developing a 3D map, which has started a new trend.
IBM: IBM contributes to the business and education sectors through its
problem-solving skills. IBM has also launched a new curriculum-oriented program on
big data at schools.

Workbook for Week 7

Part A
(3 Marks)
Exercise 1: Storage Methods (1 mark)
From your lecture and also based on the below given video link:
https://www.youtube.com/watch?v=_sXkTSiAe-A
• Write a paragraph about memory virtualization.
Answer: In computing, virtual memory is memory that is
not physical memory. Memory virtualization is done to save
physical memory and its cost. In a virtualized computing
environment, administrators can use virtual memory management to allocate
additional memory to a virtual machine that has run short of
resources. An example is VMware software,
which allows users to run several operating systems without
having the physical memory for each of them. Memory virtualization allows applications on
different servers to share data without replication.
Watch the below mentioned YouTube link:

https://www.youtube.com/watch?v=wTcxRObq738
Based on the video answer the following questions:
• What is RAID 0?
Answer: RAID stands for redundant array of inexpensive disks. RAID
allows several physical disks to be presented as one logical disk. RAID 0 provides no
fault tolerance and no duplicate data. In RAID 0 the
failure of one drive causes the whole array to
fail, resulting in total data loss. RAID 0 is
used to increase disk performance.
• Describe Striping, Mirroring and Parity.
Answer:
• Striping: Striping is a technique of dividing logically sequential
data, such as a file, into segments that are stored on different
physical storage devices. Striping is used to achieve higher performance, because data can be fetched
much faster than from a single storage device. This technique is also used for
balancing I/O load.
• Mirroring: Mirroring is a technique that automatically
maintains multiple copies of data, so that if a hard disk fails the
machine can continue processing and recover the data.
• Parity: A parity bit is added to the end of a string of binary code to indicate
whether the number of 1 bits in the string is even or odd. It is used as an
error-detecting code.
Exercise 2: Storage Design (2 marks)

Summarize storage repository design based on the following video link:
https://www.youtube.com/watch?v=eVQH7C3nulY
Answer: A storage repository is physical storage hardware on which
logical disk space is made available through a file system.
A storage repository has to meet two major requirements, which are:
• NFS based repository
The YouTube link below describes the Intelligent Storage System:

http://www.youtube.com/watch?v=1wENn4PDqDE
Based on the watched video answer the following questions:
• What is ISS?
Answer: ISS stands for intelligent storage system. It is a
feature-rich RAID array that provides highly optimized I/O
processing capabilities. An ISS provides a large amount of cache and
multiple I/O paths that enhance performance, and it supports flash
drives, virtual provisioning and automated storage tiering.
• What are the 3 main components of the ISS?
Answer: The components of an ISS are the front end, cache, back end and physical
disks.
• How cache works in ISS?
Answer: Cache is temporary memory, where data is
stored for a short time and is flushed automatically. A familiar
example of caching is web browsing: the first time we visit a website
it takes more time
to retrieve the data, and the next time the data is served directly
from the cache. Cache memory is also
the fastest memory. When a host issues a read request and
the data is found in the cache, it is called a cache hit, and the data is sent
to the host without any disk operation, with a very short response time. If
the data is not found, it is called a cache miss, and a cache
miss increases the I/O time.
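The hit and miss behaviour described above can be sketched as a small read cache in front of a slower disk. This is an illustrative model only: the `ReadCache` class, the dictionary that stands in for the disk, and the least-recently-used eviction policy are assumptions, not details taken from the video.

```python
from collections import OrderedDict


class ReadCache:
    """Toy read cache with least-recently-used (LRU) eviction."""

    def __init__(self, backing_disk, capacity):
        self.disk = backing_disk      # slow storage, modelled as a dict
        self.capacity = capacity      # maximum number of cached blocks
        self.cache = OrderedDict()    # block id -> data, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self.cache:
            # Cache hit: serve the data without touching the disk.
            self.hits += 1
            self.cache.move_to_end(block_id)
            return self.cache[block_id]
        # Cache miss: fetch from disk, then keep a copy in the cache.
        self.misses += 1
        data = self.disk[block_id]
        self.cache[block_id] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used block
        return data


disk = {0: b"block-0", 1: b"block-1", 2: b"block-2"}
cache = ReadCache(disk, capacity=2)
for block in [0, 0, 1, 2, 0]:
    cache.read(block)
print(cache.hits, cache.misses)  # 1 4
```

Only the second read of block 0 is a hit; by the time block 0 is read again at the end it has already been evicted, so that read goes back to the disk.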
Storage Area Network (SAN) and Network Attached Storage (NAS) are
widely used concepts in the data storage arena. The following YouTube
video links give a detailed description of these concepts:
− http://www.youtube.com/watch?v=csdJFazj3h0
− http://www.youtube.com/watch?v=vdf6CvGQZrk
− http://www.youtube.com/watch?v=MKZU8zOMiqE
Based on the watched videos answer the following questions:
• Describe NAS and SAN briefly using diagrams?
Answer: NAS: Network Attached Storage. A NAS device generally
integrates a processor with disk storage. It is attached to
a TCP/IP based network and accessed using specialized file access
and file sharing protocols. File requests received by a NAS
are translated by its internal processor into device requests.
SAN: Storage Area Network. It is a high-speed network that interconnects
shared storage devices and presents them to multiple servers. SANs are managed
centrally and typically use Fibre Channel, which makes them expensive and hard to manage.
• What are the advantages of SAN over NAS?
Answer: A SAN is optimized for performance and scalability. Some of its
notable potential benefits include support for high-speed Fibre Channel
media, which is optimized for storage traffic; management of multiple
disk and tape devices as a shared pool with a single point of
control; specialized backup facilities that can reduce server and LAN
utilization; and wide industry support.
• What are two common NAS file sharing protocols? How are they
different from each other?
Answer: Two common NAS file sharing protocols are NFS (Network File System)
and CIFS (Common Internet File System). NFS is traditionally used by UNIX
and Linux clients, whereas CIFS is used mainly by Windows clients.
NAS devices themselves are commonly either computer-based or embedded-system-based,
and the two differ in several respects.
For example, power consumption in computer-based NAS is the highest, and
its performance is also the most powerful. Embedded-system-based NAS, on
the other hand, needs a MIPS-based processor to run
the NAS server, and its power consumption is moderate.
Part B
(3 Marks)
Exercise 3: Storage Design (1 Mark)
Design Storage Solution for New Application
Scenario
An organization is deploying a new business application in its
environment. The new application requires 1 TB of storage space for
business and application data. During peak workload, the application is
expected to generate 4900 IOPS (I/Os per second) with an average I/O
data block size of 4 KB.
The disk drive option available from the vendor is a 15,000 rpm drive with 100
GB capacity. Other specifications of the drive are:
Average seek time = 5 milliseconds and data transfer rate = 40
MB/sec.
• You are required to calculate the number of disk drives that
can meet both the capacity and performance requirements of the application.
Hint: In order to calculate the IOPS from average
seek time, data transfer rate, disk rpm and data
block size, refer to slide 15 in the week 7 lecture slides. Once you have the IOPS,
refer to slide 16 in week 7 to calculate the required number of disks.
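The calculation can be sketched as follows. The lecture slides are not reproduced here, so this sketch assumes the commonly used model in which disk service time = average seek time + average rotational latency (half a rotation) + data transfer time, and IOPS = 1 / service time. Decimal units (1 TB = 1000 GB, 1 MB = 1000 KB) and the optional 70% utilization limit are also assumptions; the function names are illustrative.

```python
import math


def disk_service_time_s(seek_ms, rpm, block_kb, transfer_mb_per_s):
    """Time for one I/O: seek + half a rotation + block transfer (seconds)."""
    rotational_latency_s = 0.5 * 60.0 / rpm
    transfer_time_s = (block_kb / 1000.0) / transfer_mb_per_s
    return seek_ms / 1000.0 + rotational_latency_s + transfer_time_s


def required_disks(utilization=1.0):
    """Disks needed to satisfy both capacity and performance."""
    service_time = disk_service_time_s(seek_ms=5, rpm=15000,
                                       block_kb=4, transfer_mb_per_s=40)
    iops_per_disk = utilization / service_time
    disks_for_capacity = math.ceil(1000 / 100)          # 1 TB on 100 GB drives
    disks_for_performance = math.ceil(4900 / iops_per_disk)
    return max(disks_for_capacity, disks_for_performance)


# Service time = 5 ms + 2 ms + 0.1 ms = 7.1 ms, about 140 IOPS per disk.
print(required_disks())      # 35 disks if each disk runs at 100% utilization
print(required_disks(0.7))   # 50 disks if each disk is limited to 70% utilization
```

Under these assumptions capacity alone needs only 10 disks, so performance is the deciding requirement.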
Exercise 4: Storage Evolution (2 Marks)

Watch the following videos for Fiber Channel over Ethernet and answer the
questions that follow:
− http://www.youtube.com/watch?v=hSFyf-rmjA8
− http://www.youtube.com/watch?v=iCfJCzfNLrw
• What is FCoE and why we need FCoE?
Answer: FCoE stands for Fibre Channel over Ethernet. It is a computer network
technology that encapsulates Fibre Channel frames over Ethernet. FCoE allows Fibre Channel
to run over 10 Gigabit Ethernet networks while preserving the Fibre Channel protocol. FCoE is used to
reduce the number of cables, switches, network interfaces and
IP networks, as well as power and cooling costs.
In your opinion, how is FCoE more cost effective than a traditional connection?
Give a brief explanation.
Answer: In my opinion FCoE is much more expensive than
a traditional connection, because FCoE involves fibre
optic material.
If there is a break in the connection, the
whole connection has to be replaced with a new one.
You have read and answered about SAN in Part A. Based on your
understanding and with some research effort, answer the following
questions:
• What is a Virtual SAN?
Answer: Virtual SAN stands for Virtual Storage Area Network. It is a
virtual fabric made up of a collection of ports from a set of
connected Fibre Channel switches. A VSAN can be resized port by
port, whereas other fabrics are resized switch by switch.
In addition, each VSAN can be configured separately
and independently.
• What is IP SAN protocols and Fibre Channel over IP (FCIP)?
Answer: An Internet Protocol storage area network (IP SAN) allows
several servers to share pools of block storage devices
using the Internet Protocol. Fibre Channel over IP (FCIP) is
also known as Fibre Channel tunneling or storage tunneling, and it was
developed by the Internet Engineering Task Force (IETF).
Watch the video below about Introduction to Object-based and Unified
Storage:
http://www.youtube.com/watch?v=1SkUt7q8Dm8
Choose the correct answer from the following questions:
What is an advantage of a flat address space over a hierarchical address

space?
a. Highly scalable with minimal impact on performance
b. Provides access to data, based on retention policies
c. Provides access to block, file, and object with same interface
d. Consumes less bandwidth on network while accessing data
What is a role of metadata service in an OSD node?

a. Responsible for storing data in the form of objects
b. Stores unique IDs generated for objects
c. Stores both objects and objects IDs
d. Controls functioning of storage devices
What is used to generate an object ID in a CAS system?

a. File metadata
b. Source and destination address
c. Binary representation of data
d. File system type and ownership
What accurately describes block I/O access in a unified storage?

a. I/O traverse NAS head and storage controller to disk
b. I/O traverse OSD node and storage controller to disk
c. I/O traverse storage controller to disk
d. I/O is directly sent to the disk
What accurately describes unified storage?

a. Provides block, file, and object-based access within one platform
b. Provides block and file storage access using objects
c. Supports block and file access using flat address space
d. Specialized storage device purposely built for archiving
Workbook for Week 8
Part A
(3 Marks)
Exercise 1: Green Computing (0.5 Marks)

The questions in this exercise can be answered by doing internet search and/or
from the YouTube videos. Answer to each question should be one paragraph
in your own words.
• What is Greenhouse effect?
When the sun's radiation reaches the Earth's atmosphere, part of it
is reflected back into space. The rest of the sun's
energy is absorbed by the land and the oceans, which warms the Earth.
Heat then radiates from the Earth towards space.
Some of this heat is trapped by the greenhouse gases in the atmosphere, which keeps
the Earth warm enough to sustain life. Human
activities such as
burning fossil fuels, agriculture and land clearing through cutting
down trees are increasing the amount of greenhouse gases
released into the atmosphere. These gases trap extra heat, which causes
the Earth's temperature to rise above normal levels.
• What is Green IT and what are the benefits of greening IT?
Green IT stands for green information technology; it is the practice of
environmentally sustainable computing. The concept of Green IT
emerged in 1992.
The goal of Green IT is to reduce the negative impact of information technology
operations on the environment by designing, manufacturing, operating and disposing of
electronic devices, machines and other computer-related products in an environmentally
friendly way. The main goals of Green IT practices include
reducing the use of hazardous materials and increasing energy
efficiency during a product's lifetime. Other aspects of Green IT include
the redesign of data centres and the growing popularity of virtualization,
green networking and cloud computing.
The Benefits of Greening IT:
• Reducing carbon emissions
• Increasing cooling efficiency in the data centre
• Limiting energy costs
• Saving companies a large amount of money
Exercise 2: Environmental Sustainability (0.5 Marks)
Read the article in the below link and answer the questions that follow:
http://www.computer.org/csdl/mags/it/2010/02/mit2010020004.html
• According to the article how do you build a greener environment?
According to the article, building a greener environment requires changing
the older, established ways of doing things.
A holistic approach should be adopted that
makes the whole information technology life cycle green. A greener
environment can be built in four different ways:
• Green use
• Green disposal
• Green design
• Green manufacturing
Green use means reducing the energy consumption
of data centres, computers and information systems and using them in an
environmentally sound way. Green disposal
means reusing old computers and properly recycling unwanted
computers and other electronic equipment such as keyboards, mice, CPUs and external hard
disks.
Green design means designing energy-efficient
and environmentally sound components, computers, servers, cooling equipment and data centres.
Green manufacturing means manufacturing electronic
components, computers and other associated subsystems with little or no impact on
the environment.
• Summarize the article in 150 words
Answer: Information technology acts as a solution to the problems of
environmental sustainability, but it also creates problems of its
own. Information technology affects the environment in
several ways, starting with the manufacturing of computers,
other electronic devices and computer parts such as keyboards,
chips, hard disks and cables. These electronic and non-electronic components
consume electricity, chemicals and water, and they generate hazardous waste.
Problems of this kind are the reason for making information technology
green in order to achieve environmental sustainability.
Green information technology refers to building a green
environment in order to avoid the environmental problems caused by
IT. The article proposes several paths for greening IT:
• Green use: using data centres, information systems and
computers in an energy-efficient and environmentally friendly manner.
• Green disposal: reusing unwanted computers and responsibly
recycling old computers and other electronic devices.
• Green design: designing energy-efficient and environmentally
sound components, computers, servers and hard disks.
• Green manufacturing: manufacturing
electronic parts, computers and other associated subsystems with minimal
effect on the environment.
Exercise 3: Environmentally Sound Practices (1 Mark)
The questions in this exercise can be answered by doing internet search.
Briefly explain the following terms – a paragraph for each term:
• Power usage effectiveness (PUE) and its reciprocal
Power usage effectiveness is a metric used to determine the energy efficiency of a data
centre. It is calculated by dividing the total amount of power entering a data centre by
the power used to run the computer infrastructure within it. Data centre infrastructure efficiency (DCiE)
is the reciprocal of power usage effectiveness.
• Data centre efficiency (DCE)
Data centre efficiency is a term for improving the efficiency of the
data centre so that it can handle large amounts of data in order to
support and run the data applications used in everyday activities.
The concept arose because the volume of important
data keeps growing, which increases the need for businesses to obtain
costly, power-consuming technology to support and run their data applications.
• Data centre infrastructure efficiency (DCiE)
Data centre infrastructure efficiency was developed by members of The Green Grid. It
is expressed as a percentage, calculated by dividing information technology equipment
power by total facility power.
Data centre infrastructure efficiency = (IT equipment power / total facility
power) * 100%
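The relationship between the two metrics can be sketched with a short calculation. The power figures below are made-up sample values, and the function names are illustrative.

```python
def pue(total_facility_kw, it_equipment_kw):
    """Power usage effectiveness: total facility power / IT equipment power."""
    return total_facility_kw / it_equipment_kw


def dcie_percent(total_facility_kw, it_equipment_kw):
    """Data centre infrastructure efficiency as a percentage."""
    return it_equipment_kw / total_facility_kw * 100.0


# Hypothetical data centre: 1500 kW enters the facility, 1000 kW reaches IT equipment.
total_kw, it_kw = 1500.0, 1000.0
print(pue(total_kw, it_kw))                     # 1.5
print(round(dcie_percent(total_kw, it_kw), 1))  # 66.7
```

A PUE closer to 1.0, or equivalently a DCiE closer to 100%, means that less power is spent on overheads such as cooling and power distribution.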
List 5 universities that offer a Green Computing course. You should name the university,
the course name and a brief description of the course.
Green computing can be defined as the study and practice of designing,
manufacturing, using and disposing of computers, servers and associated subsystems, such as
printers, storage devices, networking accessories, speakers and headphones, efficiently and
effectively with no harm to the environment.
Ref: https://www.google.com.au/?
gfe_rd=cr&ei=KbL0Var3BKfu8wfMmoHwCw#q=green+computing+definition
Exercise 4: Major Cloud APIs (1 Mark)

The following companies are the major cloud service providers: Amazon,
GoGrid, Google, and Microsoft.
List and briefly describe (2 lines) the APIs provided by the above major
vendors.
First, Amazon launched an API known as
the EC2 API. Amazon Elastic Compute Cloud (EC2) is a web service that
provides resizable compute capacity in the cloud. It is designed to make web-scale
cloud computing easier for developers. Its simple
web service interface allows capacity to be obtained and configured with
minimal friction.
The API reference provides descriptions, syntax and usage examples for each of the
actions and data types of Amazon EC2.
Second, GoGrid developed an API known as the
GoGrid v2 API. It is a full REST interface and also a query interface.
With the GoGrid v2 API, method calls are made over the web by
sending HTTP requests such as PUT, POST,
GET and DELETE to the GoGrid API REST services.
Third, Google developed an
API known as the Google Maps API. The Google Maps APIs are available
for Android, iOS and web browsers. Using the Google Maps API, one can
get road directions to an unfamiliar place.
Finally, Microsoft developed an API
known as the Speech Application Programming Interface, or SAPI. It
allows the use of speech recognition and speech synthesis
within Windows applications. All versions of the API have
been designed so that a software developer can write an application to
perform speech recognition.
https://en.wikipedia.org/wiki/Microsoft_Speech_API
https://developers.google.com/maps/?hl=en
Part B
(3 Marks)
Exercise 2: Green cloud computing (0.5 Marks)

Xiong, N.; Han, W.; Vandenberg, A, "Green cloud computing schemes based
on networks: a survey," Communications, IET, vol.6, no.18, pp.3294,3300,
Dec. 18 2012
Most of the power consumption in data centres comes from computation processing, disk
storage, network and cooling systems. Nowadays, new technologies and methods are being
proposed to reduce energy costs in data centres. From the above paper, summarize
(in 300 words) the recent work done in these fields.
According to the article, new technologies and methods have been proposed to reduce
energy costs in data centres, and there are existing techniques for making cloud
computing greener. Microprocessors play a key role in task scheduling, as they make
the computing processes much simpler. Because data centres have to handle a large
number of tasks continuously, efficient task scheduling is important in green cloud
computing.
Using virtual machines is one of the effective approaches to reducing power
consumption and allocating resources. As for network topologies, the network design
inside a data centre is quite different from the Internet and P2P networks; it should
be designed with the aim of improving the topology of the network. As for cooling
systems, data centres can host thousands of servers, which is one of the challenges in
cloud computing because cooling becomes an issue in the data centres. As for disk
storage, online storage promises benefits in terms of lower power usage, and it
remains a topic of green cloud computing.
To make cloud computing greener, scheduling algorithms are used in data centres. Some
of the algorithms used in data centres are as follows.
First Come First Served (FCFS) algorithm:
The FCFS scheduling algorithm automatically executes queued requests, processing them
in the order of their arrival. The FCFS algorithm has some positives and negatives.
Positives:
 It is easy to implement and carries low risk.
Negatives:
 The average waiting time can be long, because every process has to wait until each
process ahead of it has executed.
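The FCFS behaviour described above can be illustrated with a minimal Python sketch. It assumes that all requests arrive at time 0 and are served in list order; the function name and the service times are hypothetical and are not taken from the paper.

```python
def fcfs_waiting_times(service_times):
    """Return the waiting time of each request when served in arrival order."""
    waiting = []
    elapsed = 0
    for service in service_times:
        # A request waits for every request queued ahead of it to finish.
        waiting.append(elapsed)
        elapsed += service
    return waiting


# Hypothetical service times; all requests are assumed to arrive at time 0.
service_times = [24, 3, 3]
waits = fcfs_waiting_times(service_times)
average_wait = sum(waits) / len(waits)
print(waits, average_wait)
```

In this made-up workload the two short requests wait behind the long one, which is why the order of arrival has such a strong effect on the average waiting time under FCFS.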
Round Robin algorithm:
It is the simplest and most basic algorithm that uses the concept of time slices.
In this algorithm, time is divided into multiple slices and each node is given a
particular time interval, during which the node performs its operations.
Positives:
It works more efficiently for short jobs, and it is typically used in time-sharing
systems.
Negatives:
 It carries high overhead because of frequent context switches.
 It increases the average waiting time.
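The time-slice idea behind Round Robin can be sketched in Python as follows. This is only an illustration under stated assumptions: all jobs arrive at time 0, switching between jobs costs no time, and the function name, the job lengths and the quantum of 4 are hypothetical.

```python
from collections import deque


def round_robin(service_times, quantum):
    """Simulate Round Robin for jobs that all arrive at time 0.

    Returns (waiting_times, slices), where slices is the number of time
    slices handed out, a rough proxy for scheduling overhead.
    """
    remaining = list(service_times)
    finish = [0] * len(service_times)
    queue = deque(range(len(service_times)))
    clock = 0
    slices = 0
    while queue:
        job = queue.popleft()
        run = min(quantum, remaining[job])  # run for one slice at most
        clock += run
        remaining[job] -= run
        slices += 1
        if remaining[job] > 0:
            queue.append(job)  # unfinished: go to the back of the queue
        else:
            finish[job] = clock
    # Waiting time = completion time minus the time actually spent running.
    waiting = [finish[i] - service_times[i] for i in range(len(service_times))]
    return waiting, slices


waiting, slices = round_robin([24, 3, 3], quantum=4)
print(waiting, slices)
```

A smaller quantum hands out more slices for the same work, which is where the context-switch overhead listed under the negatives comes from; the sketch counts slices but does not model the cost of a switch.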
Exercise 3: Cloud API Functionalities (2 Marks)

List the functionalities that can be achieved by using the APIs mentioned in
the following link:
https://code.google.com/p/sainsburys-nectar-api/
The functionality available through the API corresponds to that of the official
Nectar service. It includes retrieving account details, retrieving offers, opting in
to offers, and conditions. Retrieving account details covers getter calls on the
account object for the name, currency, points balance, monetary value and account
type. Retrieving offers covers getter calls on the offer object for the offer ID,
the valid-from date and the valid-till date.
What API is used in the following link and how is it used?
https://pypi.python.org/pypi/python-novaclient
The API used in the link is python-novaclient 2.29.0. python-novaclient is
licensed under the Apache License, like the rest of OpenStack. Installing the
command-line API provides a shell command called nova, which you can use to
interact with any OpenStack cloud. To use it, supply the user name and
password with --os-username and --os-password, then define the authentication
URL with --os-auth-url and the version of the API with
--os-compute-api-version, and specify the region name (OS_REGION_NAME). The
complete documentation for the shell is available by running nova help.
OpenStack is an open source collaborative software project which meets many of
the cloud needs.
The links below give extensive information about OpenStack.
https://support.rc.nectar.org.au/docs/openstack
http://docs.openstack.org/ap/quick-start/content/
Write a report (2 pages) about the Open stack features and functionalities.
OpenStack has several projects, each with its own code name:
 Compute is called Nova
 Object Storage is called Swift
 Image service is called Glance
 Identity is called Keystone
 Dashboard is called Horizon
 Networking is called Neutron
 Block Storage is called Cinder
 Metering is called Ceilometer
 Orchestration is called Heat
OpenStack has various resources, such as:
 Project home
 OpenStack docs
 API Reference: the basics of authenticating and using the Compute and Image APIs
 API references: application programming interfaces offer a way to use the
capabilities of a service through predefined functions. The references include
Block Storage API v2, Compute API v2.1 (CURRENT), Compute API v2 (SUPPORTED),
Compute API v2 extensions (SUPPORTED) and Identity API v3 (CURRENT)
 API Quick Start: a guide to using the OpenStack APIs
 OpenStack clients: the official command-line clients, including those in development
 OpenStack blog: a collection of thoughts from the developers and the other key
players of the OpenStack projects
 Research cloud user forum: an open forum where anyone can share experiences
 OpenStack documentation for each release, for instance the documentation for Kilo
 TryStack: a freely available hardware resource that can be used to test OpenStack
applications
OpenStack API quick start:
To get started quickly with OpenStack API requests, we can use several
methods. The first is cURL, a command-line tool that lets us send HTTP
requests and receive responses.
OpenStack command-line clients: each OpenStack project provides a
command-line client that lets us access its API through easy-to-use
commands.
REST clients: Mozilla and Google both offer browser-based graphical
interfaces for REST. The Mozilla REST client supports all HTTP methods in
RFC 2616 (HTTP/1.1) and RFC 2518 (WebDAV). We can build custom HTTP
requests (a custom method with a resource URL and an HTTP request body) to
test requests directly against a server.
OpenStack Python Software Development Kit (SDK): this SDK is used to write
Python automation scripts that create and manage resources in our OpenStack
cloud. The SDK implements Python bindings to the OpenStack API, which lets
us perform automation tasks in Python by making calls on Python objects
rather than making REST calls directly. All of the OpenStack command-line
tools are implemented using the Python SDK.
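The idea of building a custom HTTP request (method, resource URL and request body) can be sketched with Python's standard library. This is a minimal illustration rather than an OpenStack client: the helper name, the URL and the body are hypothetical, and the request is only constructed, not sent.

```python
import json
import urllib.request


def build_request(method, url, body=None):
    """Build (but do not send) an HTTP request with an optional JSON body."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(url, data=data, method=method)
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


# Hypothetical resource URL and body, used only for illustration.
request = build_request(
    "PUT", "https://cloud.example.com/v2/resources/42", {"name": "demo"}
)
print(request.get_method(), request.full_url)
```

Passing the returned object to urllib.request.urlopen would actually send it; that step is left out here because the endpoint above does not exist.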
For the authentication and API request workflow, each request parameter is
listed with its name, type and description.
First, request an authentication token from the Identity endpoint that the
cloud administrator provides. Send a payload of credentials in the request,
using the following parameters.
Parameter              Type        Description
username (required)    xsd:string  The user name. If you do not provide a user name
                                   and password, you must provide a token.
password (required)    xsd:string  The password for the user.
tenantName (optional)  xsd:string  The tenant name. Both the tenantId and tenantName
                                   are optional, but should not be specified
                                   together. If both attributes are specified, the
                                   server responds with a 400 Bad Request.
tenantId (optional)    capi:UUID   The tenant ID. Both the tenantId and tenantName
                                   are optional, but should not be specified
                                   together. If both attributes are specified, the
                                   server responds with a 400 Bad Request. If you do
                                   not know the tenantId, you can send a request
                                   with "" for the tenantId and get the ID returned
                                   to you in the response.
token (optional)       capi:UUID   A token. If you do not provide a token, you must
                                   provide a user name and password.
If the request succeeds, the server returns an authentication token. Send
API requests with the token included in the X-Auth-Token header. Continue to
send API requests with that token until the job is completed.
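The parameter rules and the token header described above can be sketched in Python. This is a hedged illustration: the nested JSON layout ("auth", "passwordCredentials", "token") is assumed to follow the Identity API v2.0 convention and is not given in this workbook, and the function names and sample values are hypothetical. Nothing is sent over the network.

```python
import json


def build_auth_payload(username=None, password=None, token=None,
                       tenant_name=None, tenant_id=None):
    """Build the credentials payload for a token request.

    The JSON layout is an assumption based on the Identity API v2.0 style.
    """
    if tenant_name is not None and tenant_id is not None:
        # Per the table, the server would answer 400 Bad Request.
        raise ValueError("specify tenantName or tenantId, not both")
    auth = {}
    if username is not None and password is not None:
        auth["passwordCredentials"] = {"username": username,
                                       "password": password}
    elif token is not None:
        auth["token"] = {"id": token}
    else:
        raise ValueError("provide a user name and password, or a token")
    if tenant_name is not None:
        auth["tenantName"] = tenant_name
    if tenant_id is not None:
        auth["tenantId"] = tenant_id  # "" is allowed when the ID is unknown
    return {"auth": auth}


def token_header(token):
    """Header to include in every later API request."""
    return {"X-Auth-Token": token}


# Hypothetical credentials, for illustration only.
payload = build_auth_payload(username="demo", password="secret",
                             tenant_name="demo-project")
print(json.dumps(payload, indent=2))
print(token_header("abc123"))
```

Checking the tenantName and tenantId combination on the client side mirrors the rule in the table, so an invalid request is rejected before it would reach the Identity endpoint.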

