Data Acquisition

Data acquisition involves collecting raw data from relevant sources. There are two main types of data: numeric data (discrete or continuous) and text data. Data can also be structured, unstructured, or semi-structured depending on whether it has a defined structure or pattern. Some other types of data include time-stamped, machine, spatiotemporal, open, real-time, and big data. Training data refers to input data used to train a system while testing data is the processed output data used to test or evaluate a system.

Uploaded by FroFee F

Understanding data acquisition:
The term "data acquisition" consists of two words:
1. Data: raw facts, figures, or statistics collected for reference or analysis.
2. Acquisition: acquiring data for the project. The stage of acquiring data from the relevant sources is known as data acquisition.
Classification of Data
Basic Data
Basically, data is classified into two categories:
1. Numeric Data:
Mainly used for computation. Numeric data can be classified into the following:
o Discrete Data: contains only integer values, with no decimal or fractional part. Countable data can be considered discrete data. For example, 132 customers or 126 students.
o Continuous Data: can take any value within a range, including fractional values. Data that is measured rather than counted belongs in this category. For example, 10.5 kg or 100.50 km.
2. Text Data: mainly used to represent names, collections of words, phrases, and other textual information.
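
The basic categories above can be sketched in a few lines of Python. This is only an illustration: it assumes that a value's Python type (int, float, str) is a reasonable stand-in for discrete, continuous, and text data, and the function name classify_value is hypothetical. A whole number stored as a float, such as 3.0, would be labeled continuous under this assumption.

```python
def classify_value(value):
    """Return 'discrete', 'continuous', or 'text' for a single value."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool):
        raise TypeError("booleans are not covered by this classification")
    if isinstance(value, int):
        return "discrete"      # countable, no fractional part
    if isinstance(value, float):
        return "continuous"    # measured, may have a fractional part
    if isinstance(value, str):
        return "text"          # names, words, phrases
    raise TypeError(f"unsupported type: {type(value).__name__}")


# Sample values taken from the examples in the slides, plus one text value.
samples = [132, 126, 10.5, 100.50, "Data Acquisition"]
labels = [classify_value(v) for v in samples]
```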
Structural Classification
Data that is going to be fed into the system to train the model, or that has already been fed in, can follow a specific set of constraints, rules, or a unique pattern. Such data can be considered structural data.
Structural classification is divided into 3 categories:
1. Structured Data
2. Unstructured Data
3. Semi-Structured Data
1. Structured Data:
Structured data follows a specific pattern or set of rules. It has a simple structure and is stored in a specific form, such as a table.
Examples: a cricket scoreboard, your school timetable, an exam datasheet.
2. Unstructured Data:
Data that doesn't have any specific pattern or constraints, and that can be stored in any form, is known as unstructured data. Most of the data that exists in the world is unstructured.
Examples: YouTube videos, Facebook photos, dashboard data of a reporting tool.
3. Semi-Structured Data:
It is a combination of structured and unstructured data. Some of the data can have a structure like a database, whereas other data uses markers and tags to identify its structure.
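
To make the three structural categories concrete, here is a small Python sketch. All sample values (players, runs, the comment text) are made up for illustration, and the choice of CSV for structured data and JSON for semi-structured data is an assumption, not something the slides prescribe.

```python
import csv
import io
import json

# Structured: every row follows the same fixed columns, like a table.
# (Hypothetical scoreboard values.)
scoreboard_csv = "player,runs,balls\nA,45,30\nB,12,15\n"
rows = list(csv.DictReader(io.StringIO(scoreboard_csv)))

# Semi-structured: keys act as markers or tags, but records can differ in shape.
records_json = '[{"name": "A", "runs": 45}, {"name": "B", "notes": "retired hurt"}]'
records = json.loads(records_json)

# Unstructured: free text with no fixed fields or pattern.
comment = "What a match! B had to leave the field early."

# Every structured row has the same columns; semi-structured records may not.
structured_columns = [sorted(row.keys()) for row in rows]
semi_structured_keys = [sorted(record.keys()) for record in records]
```

Comparing structured_columns with semi_structured_keys shows the difference: the table rows all share one set of columns, while the tagged records each carry their own set of keys.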
Other Classification
This classification is subdivided into the following branches:
1. Time-Stamped Data:
This data follows a specific time order that defines the sequence of events, which helps the system predict the next best action. The timestamp can be the time at which the data was captured, processed, or collected.
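
As a minimal sketch of time-stamped data, the Python snippet below sorts a few events into time order. The events and timestamps are hypothetical, and the use of ISO 8601 strings is an assumption made for the example.

```python
from datetime import datetime

# Hypothetical events as (timestamp, action) pairs; values are illustrative only.
events = [
    ("2024-01-01T10:05:00", "search"),
    ("2024-01-01T10:00:00", "login"),
    ("2024-01-01T10:07:30", "comment"),
]

# Parse the ISO 8601 timestamps and sort so the sequence follows time order.
ordered = sorted(events, key=lambda event: datetime.fromisoformat(event[0]))
actions_in_order = [action for _, action in ordered]
```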
2. Machine Data:
The result or output of a specific program, system, or technology is considered machine data. It includes data related to a user's interaction with the system, such as the user's logged-in session data, specific search records, and user engagement such as comments, likes, and shares.
3. Spatiotemporal Data:
Data that contains information related to both geographical location and time is considered spatiotemporal data. It records the location (through GPS) and the time at which the event is captured or the data is collected.
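
A spatiotemporal record can be sketched as a location plus a timestamp. The class name SpatiotemporalRecord, its field names, and the sample coordinates below are hypothetical choices for this example, not part of the slides.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SpatiotemporalRecord:
    """One observation: where (GPS coordinates) and when it was captured."""
    latitude: float
    longitude: float
    captured_at: datetime

    def __post_init__(self):
        # Basic range checks for GPS coordinates given in degrees.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError("latitude must be between -90 and 90")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError("longitude must be between -180 and 180")


# Illustrative values only.
record = SpatiotemporalRecord(
    latitude=28.6139,
    longitude=77.2090,
    captured_at=datetime(2024, 1, 1, 10, 0, tzinfo=timezone.utc),
)
```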
4. Open Data: data that is freely available to everyone. Anyone can reuse this kind of data.
5. Real-time Data: data that becomes available as the event occurs is considered real-time data.
6. Big Data:
Data that is too large or complex to be stored or handled by traditional data management software, such as a DBMS or RDBMS, can be considered big data.
Data Features
Data features refer to the type of data to be collected. Two terms are associated with this:
1. Training Data
2. Testing Data
1. Training Data:
Data collected through the system and used to train it is known as training data. The input given by the user to the system can be considered training data.
2. Testing Data:
The result data set, or processed data, that is used to test or evaluate the system is known as testing data.
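
The slides do not describe how training and testing data are separated. In common machine-learning practice, one collected data set is often split into a training part and a held-out testing part. The sketch below assumes that approach; the function name split_dataset, the 20% test fraction, and the fixed seed are all hypothetical choices for illustration.

```python
import random


def split_dataset(records, test_fraction=0.2, seed=0):
    """Shuffle records and return (training_data, testing_data)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(records)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    # Hold out at least one record for testing.
    test_size = max(1, int(len(shuffled) * test_fraction))
    return shuffled[test_size:], shuffled[:test_size]


# Ten placeholder records standing in for collected data.
training_data, testing_data = split_dataset(range(10))
```

With ten records and a test fraction of 0.2, eight records go to training and two are held out for testing.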
