Activity

The document is a group assignment for an Introduction to Emerging Technologies course, focusing on data science and its role in emerging technology. It covers definitions of data, information, and big data, as well as differences between data and information, data processing methods, data types, and characteristics of big data. Additionally, it discusses the big data life cycle, tools used in this process, and methods for computing over large datasets.

Introduction to Emerging Technologies Course

Group Assignment

Name: Bemnet Zenebe ID: 290433-16


Name: Eyerusalem Mekonn ID: 887579-16
Name: Bezawit Tahir

Activity 2.1
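1. What is data science? Can you describe the role of data in emerging
technology?
- Data science is a multi-disciplinary field that uses scientific
methods, processes, algorithms, and systems to extract
knowledge and insights from structured, semi-structured, and
unstructured data.
- Data is the raw input that emerging technologies such as artificial
intelligence and the Internet of Things rely on to learn, predict, and
support decisions.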
2. What are data and information?
Data:
- A representation of facts, concepts, or instructions in a formalized
manner, suitable for communication, interpretation, or processing
by humans or electronic machines.
Information:
- Processed data on which decisions and actions are based.
3. What is big data?
Big data is the term for a collection of data sets so large and
complex that it becomes difficult to process using on-hand
database management tools or traditional data processing
applications.
Activity 2.3
4. Discuss the main differences between data and information with
examples.
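Data is raw, unprocessed facts, figures, or observations, while
information is data that has been processed, organized, and given
context to make it meaningful. Data is like the ingredients, while
information is the finished meal. For example, the individual marks of
the students in a class are data; the class average computed from those
marks is information.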
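
As a minimal sketch of this distinction, assume some hypothetical daily temperature readings. The values and the `summarize` helper are illustrative and not part of the assignment. The raw list is data, and the computed summary is information:

```python
from statistics import mean

# Raw data: unprocessed temperature readings in degrees Celsius (made-up values).
readings = [21.5, 23.0, 19.5, 24.0, 22.0]


def summarize(values):
    """Process raw readings into a summary a person can act on."""
    return {
        "count": len(values),
        "average": mean(values),
        "highest": max(values),
        "lowest": min(values),
    }


# Information: the same data, processed and given context.
summary = summarize(readings)
print(f"Average temperature over {summary['count']} days: {summary['average']:.1f} C")
```

The list on its own says little; the summary answers a question such as "how warm was it this week?".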

5. Can we process data manually using a pencil and paper? Discuss the
differences with data processing using the computer.
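- Yes, data can be processed manually with pencil and paper, but this is
slow and error-prone for anything beyond a small amount of data.
Computer processing offers greater speed, accuracy, and efficiency.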
Activity 2.4
6. Discuss data types from programming and analytics perspectives.
From a programming perspective, common data types include:
• Integers (int): store whole numbers, known mathematically as
integers
• Booleans (bool): represent values restricted to one of two states,
true or false
• Characters (char): store a single character
• Floating-point numbers (float): store real numbers
• Alphanumeric strings (string): store a combination of
characters and numbers
From an analytics perspective, data is classified as structured,
semi-structured, or unstructured.
7. Compare metadata with structured, unstructured and semi-structured
data.
8. Give at least one example of structured, unstructured and
semi-structured data types.
• Structured: a table with rows and columns, such as a spreadsheet or a
database table.
• Unstructured: email messages, videos, photos, webpages, and
audio files.
• Semi-structured: JSON and XML documents.
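
To make the three categories concrete, here is a small sketch that uses only Python's standard library. The sample records (names, fields, and the email text) are invented for illustration.

```python
import csv
import io
import json

# Structured: fixed rows and columns, as in a spreadsheet or database table.
csv_text = "id,name,age\n1,Abebe,21\n2,Sara,23\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured: self-describing keys or tags, but no fixed schema.
json_text = '{"id": 3, "name": "Hana", "courses": ["Emerging Technologies"]}'
record = json.loads(json_text)

# Unstructured: free text with no predefined fields.
email_body = "Hi team, please send me the assignment before Friday."

print(rows[0]["name"])          # field accessed by column name
print(record["courses"][0])     # nested field accessed by key
print(len(email_body.split()))  # only generic operations, such as counting words
```

Structured and semi-structured data can be queried by field name, while the unstructured text has to be interpreted before any specific fact can be pulled out of it.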
Activity 2.5
9. Which information flow step in the data value chain do you think is
labor-intensive? Why?
The information flow step in the data value chain that is most likely to
be labor-intensive is data collection. This is because data collection
often involves gathering raw data from various sources, which can
require significant manual effort.
10. What are the different data types and their value chain?
The Data Value Chain describes the information flow within a big data
system as a series of steps needed to generate value and useful
insights from data.
Activity 2.6
11. List and discuss the characteristics of big data
• Volume: large amounts of data, up to zettabytes (massive datasets)
• Velocity: data is live, streaming, or in motion
• Variety: data comes in many different forms from diverse sources
• Veracity: can we trust the data? How accurate is it?
12. Describe the big data life cycle. Which step do you think is most
useful and why?
- The big data life cycle involves several stages: data collection, data
storage, data processing, data analysis, and knowledge creation. The
most useful step is data analysis, as it transforms raw data into valuable
insights that can guide decisions and strategies.
13. List and describe each technology or tool used in the big data life
cycle.
- Two commonly used tools are Apache Hadoop and MongoDB.
Apache Hadoop: one of the most widely used big data tools. It is an
open-source software platform that stores and processes big data in a
distributed computing environment across hardware clusters. This
distribution allows for faster data processing.
MongoDB: a document-oriented NoSQL database that stores data as
JSON-like documents.
14. Discuss the three methods of computing over a large dataset.
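- The three methods are data sampling, data chunking, and distributed
computing. Sampling analyzes a representative subset of the data,
chunking splits the data into pieces that are processed one at a time,
and distributed computing spreads the work across multiple machines.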
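
The three methods can be sketched on a small in-memory list. This is a toy illustration under stated assumptions: the dataset, the chunk size, and the helper names `chunks` and `chunk_sum` are hypothetical, and a local thread pool stands in for a real cluster.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical dataset: the integers 1..1000 stand in for a large file.
dataset = list(range(1, 1001))

# 1. Data sampling: estimate a statistic from a random subset.
random.seed(0)  # fixed seed so the sample is repeatable
sample = random.sample(dataset, 100)
estimated_mean = sum(sample) / len(sample)


# 2. Data chunking: process the data piece by piece to limit memory use.
def chunks(data, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


def chunk_sum(chunk):
    """Work done on a single chunk."""
    return sum(chunk)


total_chunked = sum(chunk_sum(c) for c in chunks(dataset, 250))

# 3. Distributed computing: hand chunks to separate workers and combine
#    the partial results (threads here stand in for cluster nodes).
with ThreadPoolExecutor(max_workers=4) as pool:
    total_distributed = sum(pool.map(chunk_sum, chunks(dataset, 250)))

print(estimated_mean, total_chunked, total_distributed)
```

Sampling trades exactness for speed, while chunking and distributed computing both give the exact total; they differ only in whether the pieces are processed one after another or in parallel.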
