
Assignment Data Science

The document discusses key concepts in data science including: 1. Data science transforms data into useful information through scientific methods, statistics, and algorithms. 2. The data processing cycle involves collecting data from various sources, preparing it, inputting it into a computer, processing it, outputting it to users, and storing it for future use. 3. The data value chain demonstrates obtaining value from data through acquisition, analysis, curation, storage, and usage. It involves gathering, cleaning, exploring, modeling, managing, recording, and accessing data for decision making.

Uploaded by

Jillian Noreen
Copyright
© All Rights Reserved

Assignment 2: Data Science

1. What is Data Science?


● Data science is a field of study that uses scientific methods,
statistics, and algorithms to extract information and insights from
various types of data, including structured, semi-structured, and
unstructured data. Put simply, data science transforms data into
useful information. This matters because raw data has little value
until it is turned into information.
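
As a rough illustration of the three data types mentioned above, the sketch below pulls the same kind of value (a person's age) out of a structured, a semi-structured, and an unstructured source, then turns the values into a small summary. The sample records, field names, and the functions `ages_from_all_sources` and `summarize` are all made up for this example.

```python
import csv
import io
import json
import re

# Hypothetical sample data, one source per data type.
structured = "name,age\nAna,31\nBen,27\n"                      # CSV: fixed columns
semi_structured = '{"name": "Cara", "profile": {"age": 45}}'   # JSON: nested keys
unstructured = "Dan told us he turned 39 last week."           # free text


def ages_from_all_sources():
    ages = []
    # Structured data: every row has the same named columns.
    for row in csv.DictReader(io.StringIO(structured)):
        ages.append(int(row["age"]))
    # Semi-structured data: the value sits under nested keys.
    ages.append(json.loads(semi_structured)["profile"]["age"])
    # Unstructured data: no schema, so a pattern search is used instead.
    match = re.search(r"\b(\d{1,3})\b", unstructured)
    if match:
        ages.append(int(match.group(1)))
    return ages


def summarize(ages):
    # Data becomes information once it is summarized into something usable.
    return {"people": len(ages), "average_age": sum(ages) / len(ages)}


print(summarize(ages_from_all_sources()))
```

The unstructured case is the fragile one: a pattern search only works while the text happens to contain a single number, which is part of why unstructured data is harder to process.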

2. Explain the Data Processing Cycle.


● The data processing cycle is the series of steps needed to
transform data into information. It begins with data collection,
the stage in which data is gathered from multiple sources using
various collection methods and techniques. The second step is data
preparation, in which the data is cleaned to remove inaccurate
entries and organized for processing. The third step is data input,
in which the cleaned data is entered into a computer and converted
into machine-readable form. The fourth step is processing, in which
the data is processed using various methods and techniques. The
fifth step is data output, in which the processed data is delivered
to the user in a readable, summarized format. The last stage is data
storage, in which the data is stored for future use so that the user
can easily retrieve it in any desired format.
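
The six stages described above can be sketched as a small pipeline. This is only a minimal illustration, not a standard implementation: the temperature readings, the function names, and the choice of JSON for storage are all assumptions made for the example.

```python
import json
from statistics import mean


def collect():
    # Collection: raw readings gathered from two hypothetical sources.
    source_a = ["21.5", "22.0", "n/a"]
    source_b = ["19.5", "", "23.0"]
    return source_a + source_b


def prepare(raw):
    # Preparation: drop entries that are not valid numbers.
    cleaned = []
    for item in raw:
        try:
            float(item)
        except ValueError:
            continue
        cleaned.append(item)
    return cleaned


def to_machine_form(cleaned):
    # Input: convert the cleaned text into numeric, machine-readable values.
    return [float(item) for item in cleaned]


def process(values):
    # Processing: compute summary statistics from the numeric values.
    return {"count": len(values), "mean": mean(values), "max": max(values)}


def output(summary):
    # Output: present the result in a readable, summarized format.
    return (f"{summary['count']} readings, "
            f"mean {summary['mean']:.1f}, max {summary['max']:.1f}")


def store(summary):
    # Storage: serialize the result so it can be retrieved later.
    return json.dumps(summary)


def run_cycle():
    summary = process(to_machine_form(prepare(collect())))
    return output(summary), store(summary)


report, stored = run_cycle()
print(report)
```

Each function corresponds to one stage, so the order of the calls in `run_cycle` mirrors the order of the cycle itself.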

3. Explain the Data Value Chain.


● The data value chain is a conceptual framework used in data science
that demonstrates a series of activities to obtain value and useful
insights from data. The data value chain consists of the
following activities:
○ Data Acquisition - It is the process of gathering and
cleaning data before it is stored in a data warehouse or any
other storage solution on which data analysis can be
performed.
○ Data Analysis - It is the process of exploring, transforming, and
modeling data to obtain useful information that can be used
for decision-making.
○ Data Curation - It involves processes required to actively
manage data, ensuring that it fits the requirements for its
effective usage. Data curation includes processes such as
selection, categorization, transformation, validation, and
preservation of data.
○ Data Storage - It is the process of recording data in a storage
solution to ensure its security and accessibility for future use.
○ Data Usage - It covers the business activities that require
access to data and its analysis, together with the tools
needed to support decision-making.
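
The five activities listed above can be sketched as one function each, run in the same order. This is a toy illustration under stated assumptions: the sales records, the "high"/"low" demand threshold of 10 units, and the in-memory dictionary standing in for a data warehouse are all hypothetical.

```python
from collections import defaultdict


def acquire():
    # Data acquisition: gather raw records and drop incomplete ones.
    raw = [
        {"product": "pen", "units": 12},
        {"product": "ink", "units": None},
        {"product": "pen", "units": 8},
        {"product": "pad", "units": 3},
    ]
    return [r for r in raw if r["units"] is not None]


def analyze(records):
    # Data analysis: total units sold per product.
    totals = defaultdict(int)
    for r in records:
        totals[r["product"]] += r["units"]
    return dict(totals)


def curate(totals):
    # Data curation: validate each total and categorize the product.
    curated = {}
    for product, units in totals.items():
        if units < 0:
            continue  # validation: negative totals are rejected
        demand = "high" if units >= 10 else "low"
        curated[product] = {"units": units, "demand": demand}
    return curated


def store(curated, warehouse):
    # Data storage: record curated data in a storage solution (here, a dict).
    warehouse.update(curated)
    return warehouse


def use(warehouse):
    # Data usage: access stored data to support a decision,
    # in this case which products to restock.
    return sorted(p for p, info in warehouse.items() if info["demand"] == "high")


warehouse = store(curate(analyze(acquire())), {})
print(use(warehouse))
```
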

4. What is Big Data?


● Big data is a vast, multifaceted collection of data that traditional
data management tools cannot store, process, manage, or retrieve
efficiently. Handling such fast-growing data therefore typically
requires multiple computers and robust software working together.
Big data has three defining properties, known as the three Vs:
volume, velocity, and variety. Volume refers to the large amount of
data being produced at every moment. Velocity refers to the speed at
which data is generated. Variety refers to the range of formats the
data takes, such as structured, semi-structured, and unstructured.
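
The idea of spreading the work across multiple computers can be hinted at on a single machine. In the sketch below, a thread pool stands in for separate machines: the data is split into chunks, each worker counts words in its own chunk, and the partial results are merged. The function names, chunk size, and sample lines are assumptions made for this example, and a real big data system would use a distributed framework rather than threads.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    # Split the data set into fixed-size chunks, one per unit of work.
    return [items[i:i + size] for i in range(0, len(items), size)]


def count_words(lines):
    # Work done independently on one chunk.
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(lines, chunk_size=2, workers=2):
    # Each worker processes a chunk; the partial counts are then merged.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(count_words, chunked(lines, chunk_size))
        total = Counter()
        for partial in partials:
            total.update(partial)
    return total


lines = ["big data", "data velocity", "data variety", "big volume"]
print(parallel_word_count(lines).most_common(2))
```

Because each chunk is counted independently, the merged result matches a single pass over all the data, which is what makes this kind of work easy to divide.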
