Emrging
Emrging
(CoSc1012)
Data Science
Data Acquisition Data Analysis Data curation Data Storage Data Usage:
requires:
❑ managing cluster membership, coordinating resource sharing and
❑ scheduling actual work on individual nodes.
Cluster membership and resource allocation can be handled by software like Hadoop’s
YARN. “Yet Another Resource Negotiator”.
Intr. to Emerging Tech.
prep: Werkneh E.(Msc in Comp Net)
2.5.2.2.Hadoop and its Ecosystem
Open source Software for distributed
storage and processing, of large
datasets across clusters of computers
Four key characteristics of Hadoop are:
Economical: ordinary computers can be used
for data processing.
Reliable: stores copies of the data on different
machines and is resistant to hardware failure.
Scalable: It is easily scalable, Horizontal and
vertical