Week 3 Assignment

Uploaded by

okeowomuhmeen
Copyright
© All Rights Reserved

NAME: MUMUNI OKE

Fellow ID: FE/23/63820632


ALC: Grazac Technologies Limited (Cohort 2)
Module 3 Assignment.

Question: Provide an overview of the different types of datasets used in data analysis (Structured,
Unstructured, and Semi-Structured).

Structured data
Structured data is data whose elements are addressable for effective analysis. It is organized into a
formatted repository, typically a database. It makes up about 10% - 20% of generated data and has
clearly defined data types and patterns, which make it easy to store and organize into rows and
columns. It is usually stored in a relational database (queried with SQL) or in spreadsheets. Records have
relational keys and can easily be mapped into pre-designed fields. Today, structured data is the most
widely processed kind of data and the simplest way to manage information. Example: Relational data.
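
To make the idea of rows, columns, and relational keys concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (customers, orders), the column names, and the sample values are assumptions made up for illustration; they do not come from the assignment.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two tables linked by a key."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE orders (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            amount      REAL NOT NULL
        );
        """
    )
    # Hypothetical sample rows: every value fits a pre-designed field.
    conn.executemany(
        "INSERT INTO customers (id, name) VALUES (?, ?)",
        [(1, "Ada"), (2, "Bola")],
    )
    conn.executemany(
        "INSERT INTO orders (id, customer_id, amount) VALUES (?, ?, ?)",
        [(1, 1, 25.0), (2, 1, 10.0), (3, 2, 7.5)],
    )
    conn.commit()
    return conn


def total_per_customer(conn):
    """Join the two tables on the relational key and sum order amounts."""
    return conn.execute(
        """
        SELECT c.name, SUM(o.amount)
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id
        ORDER BY c.name
        """
    ).fetchall()


if __name__ == "__main__":
    for name, total in total_per_customer(build_demo_db()):
        print(name, total)
```

Because the schema is fixed in advance, the join on customer_id works without inspecting the data first, which is the main practical benefit of structured data.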

Semi-Structured data
Semi-structured data is information that does not reside in a relational database but has some
organizational properties, such as tags or markers, that make it easier to analyze. With some processing it
can be stored in a relational database (this can be very hard for some kinds of semi-structured data), but
keeping it in semi-structured form can save space. Example: XML data.
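
As a rough illustration of how tagged data can be mapped into table-like rows, the sketch below parses a small XML document with Python's standard xml.etree.ElementTree module. The XML content, the tag names, and the customers_to_rows helper are hypothetical and chosen only for this example. Note that the two records do not carry the same fields, which is typical of semi-structured data.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML: each record is tagged, but the fields vary per record.
XML_DOC = """
<customers>
  <customer id="1">
    <name>Ada</name>
    <email>ada@example.com</email>
  </customer>
  <customer id="2">
    <name>Bola</name>
    <phone>555-0100</phone>
  </customer>
</customers>
"""


def customers_to_rows(xml_text):
    """Flatten <customer> elements into dictionaries with a fixed set of keys."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for node in root.findall("customer"):
        rows.append(
            {
                "id": int(node.get("id")),
                "name": node.findtext("name"),
                # findtext returns None when the tag is absent.
                "email": node.findtext("email"),
                "phone": node.findtext("phone"),
            }
        )
    return rows


if __name__ == "__main__":
    for row in customers_to_rows(XML_DOC):
        print(row)
```

The missing fields become empty values once the records are forced into fixed columns, which hints at why storing some semi-structured data in a relational database can be awkward.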

Unstructured data
Unstructured data is data that is not organized in a predefined manner or does not have a
predefined data model, so it is not a good fit for a mainstream relational database. Alternative
platforms exist for storing and managing unstructured data. It is increasingly prevalent
in IT systems and is used by organizations in a variety of business intelligence and analytics applications.
It makes up about 80% of generated data and cannot be organized into predefined fields. Examples: Word,
PDF, text, media logs.
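
Since unstructured text has no fields to query, analysis usually starts with text search or pattern matching. The sketch below is one simple way to do that with Python's standard re and collections modules; the sample note and the helper names (find_dates, word_counts) are assumptions invented for illustration.

```python
import re
from collections import Counter

# Hypothetical free-text note: no columns, no tags, just characters.
NOTE = (
    "Customer Ada called on 2024-01-15 about a late delivery. "
    "She called again on 2024-01-17; the delivery arrived."
)


def find_dates(text):
    """Pull out anything that looks like a YYYY-MM-DD date."""
    return re.findall(r"\b\d{4}-\d{2}-\d{2}\b", text)


def word_counts(text):
    """Count lowercase alphabetic words in the text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


if __name__ == "__main__":
    print(find_dates(NOTE))
    print(word_counts(NOTE).most_common(3))
```

Any structure here (dates, keywords) has to be inferred from the text itself, in contrast to the relational and XML cases where the structure is declared up front.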

Differences between Structured, Semi-structured and Unstructured data:

Properties | Structured Data | Semi-structured Data | Unstructured Data
Technology | Based on relational database tables | Based on XML/RDF (Resource Description Framework) | Based on character and binary data
Transaction Management | Matured transaction management and various concurrency techniques | Transaction management is adapted from DBMS and is not matured | No transaction management and no concurrency
Flexibility | Schema dependent and less flexible | More flexible than structured data but less flexible than unstructured data | Most flexible; there is no schema
Scalability | Very difficult to scale the DB schema | Scaling is simpler than for structured data | More scalable
Query Performance | Structured queries allow complex joins | Queries over anonymous nodes are possible | Only textual queries are possible
