Dam Unit - V

The document provides an overview of data roles in organizations, emphasizing the importance of data in decision-making, customer satisfaction, and operational efficiency. It discusses various types of data, including structured, unstructured, and semi-structured data, as well as the significance of data modeling and its types, such as conceptual, logical, and physical data models. Additionally, it highlights the role of data modeling in organizing data, improving data quality, and supporting decision-making processes.

Uploaded by

sirikotibhoopathi

DAM UNIT - V BA III YEAR

UNDERSTANDING DATA
Overview: Identify data roles in the organization - Determine how data moves through the
data lifecycle
Data Modeling: Identify the role of data modeling in the organization - Analyze data
modeling techniques - Use tools for data modeling
Structured Data Tools: Identify core tools for RDBMSs (structured storage) - Use SQL to
perform CRUD tasks against a database
Unstructured Data Tools: Identify tools in the unstructured stack - Use tools for
unstructured data management

Q) What is Data? Identify role of DATA in organization.

A) Data refers to raw facts, statistics, or information that is collected, stored, and
processed. It can take many forms, including numbers, text, images, sounds, or any other
representations of facts. Data is the foundation for all information and knowledge and is
used in a wide range of applications in various fields, including science, business,
technology, and everyday life.

Data serves as the basis for generating information, insights, and knowledge when it is
processed, analyzed, and interpreted. In the context of organizations and decision-making,
data is a valuable asset that helps in understanding patterns, trends, and relationships,
ultimately leading to more informed and data-driven decisions.
Data is commonly grouped into the following categories:
Structured Data :
Structured data is organized in a predefined format, such as rows and columns, and is
easily managed using traditional data management tools such as spreadsheets, databases,
or tables. It is often quantitative (numbers, percentages, and other numerical values),
although it can also hold dates and short text fields. Because of its organized nature,
structured data is relatively easy to analyze using statistical methods such as regression
analysis or correlation analysis.
Unstructured Data :
Unstructured data does not have a predefined format or organization, making it difficult
to manage using traditional data management tools. Examples of unstructured data include
social media posts, emails, images, and videos. Unstructured data is typically qualitative,
meaning that it is descriptive and narrative. Analyzing it requires advanced analytics
techniques such as natural language processing (NLP) or sentiment analysis.
Semi-Structured Data :
Semi-structured data has elements of both structured and unstructured data. It is
partially organized, but not to the extent that it can be classified as structured data.
Examples of semi-structured data include XML and JSON files, which label their fields but
do not force every record into the same fixed layout. Analyzing semi-structured data
typically requires a combination of traditional data management tools and advanced
analytics techniques.
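
To make the three categories concrete, here is a minimal Python sketch. The sample values (orders, names, and the review text) are invented for illustration and are not taken from these notes. It shows that structured data can be summed directly, semi-structured data needs a check for fields that may be missing, and unstructured text needs some processing before any question can be asked of it.

```python
import csv
import io
import json
import re

# Structured: every record has the same named columns (hypothetical sales rows).
structured_text = "order_id,amount\n1,250\n2,400\n"
rows = list(csv.DictReader(io.StringIO(structured_text)))
total_amount = sum(int(row["amount"]) for row in rows)

# Semi-structured: JSON labels its own fields, but records may differ in shape.
semi_structured_text = '[{"name": "Asha", "phone": "555-0100"}, {"name": "Ravi"}]'
people = json.loads(semi_structured_text)
names_with_phone = [p["name"] for p in people if "phone" in p]

# Unstructured: free text has no fields, so it must be processed before analysis.
review = "The delivery was late, but the product quality is great."
words = re.findall(r"[a-z]+", review.lower())
mentions_late = "late" in words
```

The word lookup at the end is only a stand-in for the NLP or sentiment analysis techniques mentioned above.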

Big data: Refers to massive data sets that need to be analyzed using advanced software
to reveal patterns and trends. It is a valuable analytical asset because it provides larger
volumes of data, arriving at a faster rate, than traditional data sets.

NEELIMA pg. 1
Metadata: Put simply, metadata is data that describes other data. It summarizes key
information about a data item (for example, its author, creation date, or format), which
makes that item easier to find and reuse later.

Real time data: As its name suggests, real time data is presented as soon as it is
acquired. From an organizational perspective, it is especially valuable because it lets
you make important decisions based on the latest developments.

Machine data: This is data generated automatically by machines such as phones,
computers, websites, and embedded systems, without direct human input.

Open data : Open data is data that is freely available to anyone in terms of its use (the
chance to apply analytics to it) and rights to republish without restrictions from
copyright, patents or other mechanisms of control. The Open Data Institute states
that open data is only useful if it’s shared in ways that people can actually understand. It
needs to be shared in a standardized format and easily traced back to where it came from.

Q) Identify role of Data in the organization.

A) Business data is the collective information related to a company and its operations. This
can include any statistical information, raw analytical data, customer feedback data, sales
numbers and other sets of information. Businesses often collect as much data as possible
on their operations to use that data to help streamline operations and learn more about
customer needs so they can better serve their audience. Collecting business data can mean
polling customers, using analytical software or simply observing information.
Data is an essential asset for modern businesses because it:
1. Helps businesses make decisions

One of the most compelling reasons for a business to collect data is that data can help it
make better decisions. With reliable data, a company can make tough decisions more
quickly and understand their likely repercussions or benefits. For example, if a company
wants to expand into a new market, collecting data is a necessity because the company
needs information on how the market works, where it might fit into that market and what
kinds of customers it might serve once established in that new market.

2. Improves customer satisfaction

Customer satisfaction can help improve customer loyalty and trust, which may increase
sales and customer referrals. With raw data, businesses can study the effects of their efforts
on customer satisfaction and learn where they can improve. This can help the company
create a more pleasing, customized experience for each customer, helping to separate that
business from the competition. For example, a business might poll its customers with a
short digital survey after each purchase to ask questions about their experience. The
company can use that data to identify positive or negative trends and take action.

3. Increases revenue and profits

Data may also help a company increase its revenue and profits by making the company
more efficient, providing key insights into operations and customer satisfaction, and
helping to improve certain processes. Data can help businesses measure whether certain
actions, products or services are profitable and where their greatest expenses might be.
Identifying expenses is often the key to increasing profits because businesses can reduce
those expenses and keep more of the revenue they earn. Raw data helps the company
identify where it can trim expenses, increase efforts and earn more revenue.

4. Helps with problem-solving

Data also plays a critical role in problem-solving for company leaders. With an abundance
of data, company leaders can identify and address key problems and monitor the effects of
proposed solutions. For example, if a company identifies an issue in its manufacturing
processes, it might collect data on how much each unit costs to produce and how much
revenue it is losing through reduced production. Problems are easier to solve, and
solutions are more effective, when the person solving the problem has sufficient
information. Understanding the problem in its entirety is typically the first step toward
solving it.

5. Improves company processes

Company processes may also benefit from data, as it can show company leaders how
efficient or costly certain processes are. With this data, the company's executives can
determine how to make processes like manufacturing or customer service more efficient.
For example, a company might collect data on its marketing process and find that it's
allocating 50% of the marketing budget to social media campaigns that aren't generating
qualified leads. With this information, the company can determine whether to pull the
funding from that method and allocate it elsewhere for a better return on the investment
and continued financial savings.

6. Resource Allocation: Data assists in the allocation of resources such as budget,
personnel, and time. It helps organizations prioritize projects and initiatives based on their
expected impact and return on investment (ROI).

7. Employee Performance: Data can be used to assess employee performance, identify
training needs, and develop strategies for talent management and retention.

8. Security and Fraud Detection: Data is crucial for identifying security threats and
detecting fraudulent activities within an organization's systems.

Q) What is a Data Model? Explain the types of Data Models.

A) A data model is a blueprint that describes the internal structure of an organization’s
information. Data models ensure that all internal information is consistent and can be
easily accessed by authorized personnel or key business stakeholders.

A data model is created by examining how the information currently exists, identifying
the entities within the system, and determining where they fit in relation to each other.
It’s similar to an organizational chart, but instead of highlighting lines of authority, it
shows how information is organized.

Types of Data Models

Data modelers use a variety of techniques to create models, but there are 3 main types
of data models:
1. Conceptual Data Model
Conceptual data models are the foundation of every data model that’s created. They help
you understand which entities exist in your business and how they relate to each other.
Conceptual models don’t include the details regarding the specific attributes attached to
an entity.

A conceptual model is a diagram that describes what your business does and how things
work together. It’s a high-level view of entities and their relationships, and it’s usually
created to give stakeholders a broad overview of the database. Data modeling
tools can help you create a conceptual model quickly.

2. Logical Data Model

A logical data model describes the structure of the data in detail, independently of any
particular database product. It uses entities, attributes, relationships, cardinality, and
constraints to describe the entity set for each table in a relational database.

The logical data model provides the foundation for creating physical data models. These
can be used to define tables in relational databases (using SQL) or classes in
object-oriented languages such as Java or C++.
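
As a rough illustration of what a logical model captures, the sketch below writes two entities as Python dataclasses. The entity names (Customer, Order), their attributes, and the "amount must not be negative" rule are assumptions made for this example only. It shows attributes, a one-to-many relationship, and a constraint, without committing to any database product.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Order:
    order_id: int   # attribute that identifies the order
    amount: float   # constraint (checked below): must not be negative

    def __post_init__(self):
        if self.amount < 0:
            raise ValueError("amount must not be negative")


@dataclass
class Customer:
    customer_id: int
    name: str
    # Cardinality: one customer can have many orders.
    orders: List[Order] = field(default_factory=list)


customer = Customer(customer_id=1, name="Asha")
customer.orders.append(Order(order_id=10, amount=250.0))
customer.orders.append(Order(order_id=11, amount=400.0))
```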

3. Physical Data Model

Physical data modeling is the process of defining the structure of a database schema
to store information. The physical model is typically created by a database administrator
or system analyst. It is used to create tables, indexes, and views, which are
implemented through the use of Structured Query Language (SQL) statements.
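
A minimal sketch of a physical model follows. It uses Python's built-in sqlite3 module so that it runs without a database server. The table, index, and view names are hypothetical, and a production schema on another RDBMS would differ in data types and options.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    amount      REAL NOT NULL
);
-- An index speeds up lookups of orders by customer.
CREATE INDEX idx_order_customer ON customer_order(customer_id);
-- A view stores a query under a name, here the total per customer.
CREATE VIEW customer_totals AS
    SELECT c.name AS name, SUM(o.amount) AS total
    FROM customer c JOIN customer_order o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id;
""")
conn.execute("INSERT INTO customer VALUES (1, 'Asha')")
conn.executemany("INSERT INTO customer_order VALUES (?, ?, ?)",
                 [(10, 1, 250.0), (11, 1, 400.0)])
totals = conn.execute("SELECT name, total FROM customer_totals").fetchall()
```

The three kinds of object named in the text (tables, an index, and a view) each appear once, all created through SQL statements.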

The simplest form of data modeling involves creating models that describe how data
should be stored in tables. These models are then implemented in one or more
databases. A more complex form of data modeling involves creating a logical model
that describes how data will be accessed and manipulated by the end-users and
applications that consume it.

Q) What is Data Modelling? Identify the role of data modelling in the organization.

A) Data modelling (also spelled data modeling) is the process of creating a data model
for the data to be stored in a database. This data model is a conceptual representation of
the data objects, the associations between different data objects, and the rules that
govern them.

Data modelling also refers to the process of designing how data is stored and retrieved by
an organization. It analyzes the needs of the business, applies pretested templates (or
patterns) and best practices, and creates a database structure that can efficiently provide
the business with the information and insights it needs.

Data modeling is a key step in turning data into information. Information is of little use
unless it is delivered in a format that business users can consume. Data modeling helps
translate the requirements of business users into a data model that can be used to
support business processes and scale analytics.

Importance of Data Modeling

Here are some of the main reasons data modeling is important:

1. Organizes Data: Data modeling structures data in a logical and organized
manner, making it easier to understand and manage.
2. Improves Data Quality: Data modeling helps identify and rectify
inconsistencies and errors in data, leading to better data quality.
3. Ensures Data Integrity: Data modeling enforces constraints and relationships,
ensuring data integrity and preventing data anomalies.
4. Supports Decision Making: Well-designed data models provide valuable
insights and support informed decision-making processes.
5. Facilitates Database Design: Data modeling is a crucial step in database
design, helping create efficient and optimized database structures.
6. Reduces Redundancy: Data modeling minimizes data redundancy by
eliminating unnecessary duplication of information.
7. Simplifies Data Retrieval: A well-designed data model enables efficient and
quick data retrieval, improving system performance.
8. Enhances Application Development: Data models serve as a blueprint for
application development, making it easier to integrate data into software
solutions.
9. Enables Scalability: A robust data model supports future growth and
scalability, accommodating additional data without major disruptions.
10. Promotes Standardization: Data modeling promotes standardization and
consistency in data representation across the organization.

11. Aids Data Governance: Data modeling facilitates data governance initiatives,
ensuring compliance with regulations and data management policies.
12. Supports Data Analysis: Data models provide a structured framework for data
analysis and reporting, enabling meaningful insights.
13. Encourages Collaboration: Data modeling encourages collaboration among
business analysts, developers, and stakeholders in the data modeling process.
14. Minimizes Development Errors: By defining data requirements upfront, data
modeling reduces errors during the development phase.
15. Long-term Investment: A well-maintained data model is a long-term
investment that provides value throughout the lifecycle of the data and
applications.
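
Point 3 above (data integrity through constraints and relationships) can be sketched with Python's built-in sqlite3 module. The department and employee tables are hypothetical. Note that SQLite only enforces foreign keys once the `foreign_keys` pragma is switched on, which other database systems do not require.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute(
    "CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
)
conn.execute("""
CREATE TABLE employee (
    emp_id  INTEGER PRIMARY KEY,
    dept_id INTEGER NOT NULL REFERENCES department(dept_id),
    salary  REAL CHECK (salary >= 0)
)""")
conn.execute("INSERT INTO department VALUES (1, 'Sales')")
conn.execute("INSERT INTO employee VALUES (100, 1, 30000)")

# Rows that break a rule in the model are rejected by the database itself.
rejected = []
bad_rows = [
    ("unknown department", (101, 99, 30000)),
    ("negative salary", (102, 1, -5)),
]
for label, row in bad_rows:
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError:
        rejected.append(label)

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```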

Types of Data Modelling / Type of Database models

A database model defines the logical structure of the data within a database. It can
help people understand the data better and use it to predict future outcomes.

There are several different database model types. Some are older, while others are newer
and cater to more recent requirements. Here is a list of 7 popular database models:

1. Hierarchical Model
2. Network Model
3. Entity-relationship Model
4. Relational Model
5. Object-oriented Model
6. NoSQL Model
7. Graph Model

Let's learn about the different types of database models along with their main features
and when you should use them.

1. Hierarchical Model
• The hierarchical database model organizes data into a tree-like structure, with
a single root, to which all the other data is linked.
• The hierarchy starts from the root data and expands like a tree,
adding child nodes to the parent nodes.
• In this model, a child node has only a single parent node.
• This model efficiently describes many real-world relationships, such as the index of a
book.
• IBM's Information Management System (IMS) is based on this model.
• Each parent-child link is a one-to-many relationship between two different types of
data. For example, one department can have many courses, many teachers, and
many students.

Advantages/Disadvantages of the Hierarchical Model

Here are a few points on the advantages and disadvantages of the Hierarchical
database model:

1. Because it has one-to-many relationships between different types of data, it is
easy and fast to fetch data by following the path from parent to child.
2. However, the Hierarchical model is less flexible.
3. It also doesn't support many-to-many relationships.
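
The tree structure described for the hierarchical model can be sketched in a few lines of Python. This is an in-memory illustration only, not how IMS stores data, and the node names are assumptions that extend the department example. Each node keeps exactly one parent, which is why a record can always be reached by a single path from the root.

```python
class Node:
    """One record in a hierarchy; each node has at most one parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Return the names from the root down to this node."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))


department = Node("Department")               # the single root
courses = Node("Courses", parent=department)
teachers = Node("Teachers", parent=department)
students = Node("Students", parent=department)
algebra = Node("Algebra", parent=courses)

algebra_path = algebra.path()
root_children = [child.name for child in department.children]
```

Because `parent` holds a single node, this structure cannot express a course that belongs to two departments, which is the many-to-many limitation noted above.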

2. Network Model
• The Network Model is an extension of the Hierarchical model.
• In this model, data is organized more like a graph, and a child node is allowed to have
more than one parent node.
• Because more relationships are established between records, the data is more
interconnected.
• As the data is more interconnected, accessing related data is also easier and fast.
• This database model supports many-to-many data relationships.
• Integrated Data Store (IDS) is based on this database model.
• This was the most widely used database model before the Relational Model was
introduced.
• The implementation of the Network model is complex, and it is difficult to
maintain and to modify.
• The Network model may look suited to social networking applications, but the newer
Graph database model is generally a better fit for them.

Advantages of the Network Model

1. It supports complex relationships

2. It allows more flexibility
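
The key difference from the hierarchical model, a record with more than one parent, can be sketched as follows. The owner and member names are hypothetical, and real network databases such as IDS use record sets and pointers rather than Python dictionaries.

```python
from collections import defaultdict

# owner -> members and member -> owners, so a record can have several parents.
members_of = defaultdict(set)
owners_of = defaultdict(set)


def link(owner, member):
    """Record a relationship in both directions."""
    members_of[owner].add(member)
    owners_of[member].add(owner)


link("Dept: Mathematics", "Course: Statistics")
link("Teacher: Rao", "Course: Statistics")   # second parent for the same course
link("Course: Statistics", "Student: Asha")
link("Course: Algebra", "Student: Asha")     # one student, many courses

statistics_parents = sorted(owners_of["Course: Statistics"])
asha_courses = sorted(owners_of["Student: Asha"])
```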

3. Entity-relationship Model
• In this database model, objects of interest are divided into entities, and their
characteristics into attributes.
• Different entities are related using relationships.
• ER Models represent these relationships in pictorial form to make them
easier for different stakeholders to understand.
• This model is a good way to design a database, which can then be turned into tables in a
relational model (explained below).
• For example, if we have to design a School Database, then
Student will be an entity with attributes name, age, address, etc. As
an Address is generally complex, it can be
another entity with attributes street, pincode, city, etc., and there will be a
relationship between them.
• Relationships can also be of different types (for example, one-to-one, one-to-many, and
many-to-many). ER diagrams cover entities and relationships in more detail.

Advantages of the ER Model

1. It is easy to understand and design.

2. Using the ER model we can represent data structures easily.
3. One limitation: the ER model cannot be implemented directly in a database. It is a
step toward designing the relational database model.
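
The step from an ER design to relational tables can be sketched with the School Database example. The key columns (`student_id`, `address_id`) and the sample row are assumptions added for illustration, and the sketch uses Python's built-in sqlite3 module. Each entity becomes a table, each attribute a column, and the relationship a foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Entity: Address, with attributes street, pincode, city
CREATE TABLE address (
    address_id INTEGER PRIMARY KEY,
    street TEXT, pincode TEXT, city TEXT
);
-- Entity: Student, with attributes name and age.
-- The relationship to Address becomes a foreign key column.
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name TEXT, age INTEGER,
    address_id INTEGER REFERENCES address(address_id)
);
INSERT INTO address VALUES (1, '12 Main Road', '500001', 'Hyderabad');
INSERT INTO student VALUES (1, 'Asha', 19, 1);
""")
student_city = conn.execute("""
    SELECT s.name, a.city
    FROM student s JOIN address a ON a.address_id = s.address_id
""").fetchall()
```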

4. Relational Model
• In this model, data is organized in two-dimensional tables, and relationships between
tables are maintained by storing a common field (a key) in both.
• This model was introduced by E. F. Codd in 1970, and it has since become the
most widely used database model.
• The basic structure of data in the relational model is the table. All the information
related to a particular type is stored in rows of that table.
• Tables are also known as relations in the relational model.
• You can design tables, normalize them to reduce data redundancy,
and use Structured Query Language (SQL) to access data from the tables.
• Some of the most popular databases are based on this database model, for
example Oracle and MySQL.

Advantages of the Relational Model

1. It's simple and easy to implement.

2. Popular database software is available for this database model.
3. It supports SQL, which lets you query the data easily.
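
Here is a small sketch of two ideas from this section: linking tables through a common field, and normalizing to reduce redundancy. The tables and names are hypothetical, and it runs on Python's built-in sqlite3 module rather than Oracle or MySQL. The department name is stored once, so renaming it takes a single update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE teacher (
    teacher_id INTEGER PRIMARY KEY,
    name TEXT,
    dept_id INTEGER REFERENCES department(dept_id)  -- the common field
);
INSERT INTO department VALUES (1, 'Maths');
INSERT INTO teacher VALUES (1, 'Rao', 1), (2, 'Devi', 1);
""")

# The department name lives in one row, so one UPDATE fixes it for every teacher.
conn.execute("UPDATE department SET dept_name = 'Mathematics' WHERE dept_id = 1")

rows = conn.execute("""
    SELECT t.name, d.dept_name
    FROM teacher t JOIN department d ON d.dept_id = t.dept_id
    ORDER BY t.teacher_id
""").fetchall()
```

Had the department name been repeated in every teacher row, the same change would have needed one update per teacher, with a risk of missing some.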

5. Object-oriented Model
• In this model, data is stored in the form of objects.
• The object-oriented database model behaves much like object-oriented
programming: each object combines data with the methods that operate on it.
• Examples of Object Database Management Systems (ODBMS) include db4o and ObjectDB.
MongoDB is sometimes listed here, but it is a document-oriented NoSQL database rather
than an ODBMS.
• This database model is less mature and less widely adopted than the relational database
model.

Advantages of the Object-oriented Model

1. It can easily support complex data structures, with relationships.

2. It also supports features like Inheritance, Encapsulation, etc.
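
The features named above can be seen in ordinary object-oriented code, which is the shape of data an object database would persist. The classes below are hypothetical and nothing here is actually saved to a database. They only show inheritance, encapsulation, and a complex attribute held inside an object.

```python
class Person:
    def __init__(self, name):
        self._name = name             # leading underscore: internal state by convention

    @property
    def name(self):                   # encapsulation: read access goes through a property
        return self._name

    def describe(self):
        return f"{self.name} (person)"


class Student(Person):                # inheritance: a Student is a Person
    def __init__(self, name, courses):
        super().__init__(name)
        self.courses = list(courses)  # complex attribute: a list held inside the object

    def describe(self):               # behaviour is stored together with the data
        return f"{self.name} studies {', '.join(self.courses)}"


student = Student("Asha", ["Statistics", "Algebra"])
description = student.describe()
```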

6. NoSQL Model
• The NoSQL database model supports an unstructured style of storing data.
• In document-oriented NoSQL databases, data is stored as documents. Other NoSQL
databases use key-value, wide-column, or graph structures.
• The documents look like JSON strings or key-value based object
representations.
• It provides a flexible schema.
• It provides features like indexing and relationships between data.
• Support for data querying is often more limited than in SQL.
• This database model is well-suited to big data applications, real-time analytics,
CMS (content management systems), etc.

Advantages of the NoSQL Model

1. This database model is scalable.

2. This database model functions with high performance.
3. The NoSQL database model can handle large volumes of data.
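
The idea of a flexible schema can be sketched with plain Python dictionaries standing in for documents. This is not the API of any particular NoSQL product. The documents and the `find_by_tag` helper are invented for illustration.

```python
import json

# Each document is a JSON-like object; documents need not share the same fields.
articles = [
    {"_id": 1, "title": "Intro to SQL", "tags": ["database", "sql"]},
    {"_id": 2, "title": "Photo essay", "images": 12},
    {"_id": 3, "title": "NoSQL basics", "tags": ["database", "nosql"],
     "author": {"name": "Ravi"}},
]


def find_by_tag(documents, tag):
    """Return the titles of documents whose optional 'tags' field contains tag."""
    return [doc["title"] for doc in documents if tag in doc.get("tags", [])]


database_titles = find_by_tag(articles, "database")
as_json = json.dumps(articles[2])  # a document serializes directly to a JSON string
```

The second document has no `tags` field at all, and the query still works. A relational table would instead need a column (possibly NULL) declared in advance for every row.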

7. Graph Model
• The Graph database model represents relationships much as they occur in the real world.
• Data is represented using nodes (entities).
• The nodes are related using edges.
• The popular database Neo4j is based on the Graph database model.
• If your application has simple data requirements, the graph database model is
usually unnecessary.
• The graph database model is well-suited to modern applications like social networks
and recommendation systems.

Advantages of the Graph Model

1. It handles complex relationships very well.

2. In the modern world where there is so much data and the data has to be related in
different ways, the graph database model is very useful.
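
A recommendation of the kind mentioned above can be sketched with a dictionary of nodes and edges. This is an in-memory illustration with invented names, not Neo4j or its query language. It suggests accounts that are two "follows" edges away from a person.

```python
# Nodes are people; each edge is a directed "follows" relationship.
follows = {
    "Asha": {"Ravi", "Meena"},
    "Ravi": {"Meena", "Kiran"},
    "Meena": {"Kiran"},
    "Kiran": set(),
}


def suggest(graph, person):
    """Suggest accounts followed by the people `person` follows (two hops away)."""
    direct = graph[person]
    two_hops = set()
    for friend in direct:
        two_hops |= graph[friend]
    # Leave out accounts already followed and the person themselves.
    return sorted(two_hops - direct - {person})


suggestions = suggest(follows, "Asha")
```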

Q) What is Structured data?

A) Structured data refers to data that is organized and formatted in a specific
way to make it easily readable and understandable by both humans and
machines.
Structured data is typically found in databases and spreadsheets, and is
characterized by its organized nature.
Examples of structured data formats include relational database tables, spreadsheets,
and CSV files. XML and JSON are usually classed as semi-structured.

Definable attributes
Structured data has the same attributes for all data values. For example, every booking record
could have these attributes: booking name, event name, event date, and booking amount.
Relational attributes
Structured data tables have common values that link different datasets together. For
example, you can relate customer data with booking data by using customer
id and booking id fields. So, you can store structured data conveniently in a relational
database.

Quantitative data
Structured data lends itself well to mathematical analysis. For example, you can count and
measure the frequency of attributes and perform mathematical operations on numerical data.
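
As a small illustration of counting and arithmetic on structured records, the sketch below reuses the booking attributes mentioned earlier. The three records themselves are invented sample values.

```python
from collections import Counter
from statistics import mean

# Hypothetical booking records; every record has the same attributes.
bookings = [
    {"booking_name": "Asha", "event_name": "Concert", "booking_amount": 1200},
    {"booking_name": "Ravi", "event_name": "Concert", "booking_amount": 1500},
    {"booking_name": "Meena", "event_name": "Workshop", "booking_amount": 600},
]

# Frequency of an attribute value, then arithmetic on a numeric attribute.
event_frequency = Counter(b["event_name"] for b in bookings)
total_amount = sum(b["booking_amount"] for b in bookings)
average_amount = mean(b["booking_amount"] for b in bookings)
```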
Storage
You can store structured data in relational databases and manage it using structured query
language (SQL). SQL lets you define a data model called a schema under which you determine
preset rules—such as fields, formats, and values—for your data. You can then store structured
data in data warehouses or other relational database technology.
Structured data examples
Here are examples of structured data systems:

• Excel files
• SQL databases
• Point-of-sale data
• Web form results
• Search engine optimization (SEO) tags
• Product directories
• Inventory control
• Reservation systems

The benefits of structured data

There are several benefits of using structured data.
Ease of use
Structured data is quick to comprehend and access. Operations such as updating and
amending structured data are straightforward. Storage is efficient, as fixed-length storage units
can be allocated for data values.
Scalability
Systems that hold structured data scale in a predictable way: you can add storage and
processing power as your data volume increases. Modern systems that process structured data
can scale to several thousand TB of data.
Analytics
Machine learning algorithms can analyze structured data and identify common patterns for
business intelligence. You can use structured query language (SQL) to generate reports as well as
modify and maintain data. Structured data is also useful for big data analytics.

Q) List out some structured data tools and unstructured data tools.

Structured database tools (relational databases)

1. Oracle Database
Oracle Database is a proprietary multi-model database management system produced and
marketed by Oracle Corporation. It is commonly used for online transaction processing,
data warehousing, and mixed database workloads. As of edition 23c, it offers native support
for property graph data structures and graph queries.

2. PostgreSQL
PostgreSQL is a powerful and reliable database management system that provides a wide
range of features and benefits for businesses and organizations.

3. MySQL
MySQL is a free and open-source relational database management system (RDBMS) that is
widely used in web applications and by businesses and individuals of all sizes and
industries.

4. Amazon Aurora
Amazon Aurora is a MySQL and PostgreSQL-compatible relational database engine that
combines the speed and availability of high-end commercial databases with the simplicity
and cost-effectiveness of open source databases.

5. Microsoft SQL Server
SQL Server 2017 was the first release to run on Linux and in Docker containers as well as
on Windows, so developers can build applications using their preferred language and
environment. It also offers security features, built-in AI capabilities, and mobile BI.

6. Google Cloud SQL
Cloud SQL is a fully managed relational database service for MySQL, PostgreSQL, and
SQL Server with rich extension collections, configuration flags, and developer ecosystems.

7. Azure SQL Database
Azure SQL Database is a relational database-as-a-service using the Microsoft SQL Server
engine. It is a high-performance, reliable, and secure database you can use to build
data-driven applications and websites in the programming language of your choice,
without needing to manage infrastructure.

8. IBM Db2
IBM Db2 is built to run mission-critical workloads. It lets developers, enterprise
architects, and data engineers run low-latency transactions and real-time analytics for
demanding workloads, from microservices to AI, with an emphasis on availability,
security, scalability, and automation.

9. Microsoft Access
Microsoft Access is a database management system from Microsoft that combines the
relational Microsoft Jet Database Engine with a graphical user interface and
software-development tools.

10. Oracle TimesTen
Oracle TimesTen In-Memory Database is a full-featured relational database that's
designed to run in the application tier and store all data in main memory. This makes
reading or writing data as simple and fast as accessing RAM.

Unstructured database tools (non-relational databases)

1. Cassandra
Originally developed by Facebook, this NoSQL database is now managed by the Apache
Foundation. It’s used by many organizations with large, active datasets, including
Netflix, Twitter, Urban Airship, Constant Contact, Reddit, Cisco and Digg. Commercial
support and services are available through third-party vendors. Operating System: OS
Independent.

2. HBase
Another Apache project, HBase is the non-relational data store for Hadoop. Features
include linear and modular scalability, strictly consistent reads and writes, automatic
failover support and much more. Operating System: OS Independent.

3. MongoDB
MongoDB was designed to support humongous databases. It’s a NoSQL database with
document-oriented storage, full index support, replication and high availability, and
more. Commercial support is available through 10gen (now MongoDB, Inc.). Operating
System: Windows, Linux, OS X, Solaris.

4. Neo4j
The “world’s leading graph database,” Neo4j boasts performance improvements up to
1000x or more versus relational databases. Interested organizations can purchase
advanced or enterprise versions from Neo Technology. Operating System: Windows, Linux.

5. CouchDB
Designed for the Web, CouchDB stores data in JSON documents that you can access via
the Web or query using JavaScript. It offers distributed scaling with fault-tolerant
storage. Operating System: Windows, Linux, OS X, Android.

6. FlockDB
Best known as Twitter’s database, FlockDB was designed to store social graphs (i.e., who
is following whom and who is blocking whom). It offers horizontal scaling and very fast
reads and writes. Operating System: OS Independent.

7. OrientDB
This NoSQL database can store up to 150,000 documents per second and can load graphs
in just milliseconds. It combines the flexibility of document databases with the power of
graph databases, while supporting features such as ACID transactions and fast indexes.

8. Terrastore
Based on Terracotta, Terrastore boasts “advanced scalability and elasticity features
without sacrificing consistency.” It supports custom data partitioning, event processing,
push-down predicates, range queries, map/reduce querying and processing, and
server-side update functions. Operating System: OS Independent.

9. Hibari
Used by many telecom companies, Hibari is a key-value, big data store with strong
consistency, high availability and fast performance. Support is available through
Gemini Mobile. Operating System: OS Independent.

10. Riak
Riak is “the most powerful open-source, distributed database you’ll ever put into
production.” Users include Comcast, Yammer, Voxer, Boeing, SEOMoz, Joyent, Kiip.me,
DotCloud, Formspring, the Danish Government and many others. Operating System:
Linux, OS X.

Q) What is Unstructured Data?

A) Unstructured data is information that is not arranged according to a preset data
model or schema, and therefore cannot easily be stored in a traditional relational database or
RDBMS. Text and multimedia are two common types of unstructured content. Many
business documents are unstructured, as are email messages, videos, photos, webpages,
and audio files.

Some examples of unstructured data

Unstructured data can be created by people or generated by machines.

• Email: The body of an email message is unstructured and cannot be parsed by traditional
analytics tools. That said, email metadata (sender, recipient, date, subject) gives it some
structure, which is why email is sometimes considered semi-structured data.
• Text files: This category includes word processing documents, spreadsheets,
presentations, email, and log files.
• Social media and websites: data from social networks like Twitter, LinkedIn, and
Facebook, and websites such as Instagram, photo-sharing sites, and YouTube.
• Mobile and communications data: For this category, look no further than text
messages, phone recordings, collaboration software, chat, and instant messaging.
• Media: This data includes digital photos, audio, and video files.
• Scientific data: This includes oil and gas surveys, space exploration, seismic
imagery, and atmospheric data.
• Digital surveillance: This category features data like reconnaissance photos and
videos.
• Satellite imagery: This data includes weather data, land forms, and military
movements.
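
The Email point above can be illustrated with a small sketch. It uses Python's standard
email module on a made-up message (the addresses, subject, and body are hypothetical) to
show that the header fields can be read by name, while the body is just free text.

```python
from email import message_from_string

# A made-up raw email message (hypothetical addresses and content).
raw = (
    "From: sender@example.com\n"
    "To: receiver@example.com\n"
    "Subject: Quarterly report\n"
    "\n"
    "Hi, please find the numbers attached. Let me know what you think.\n"
)

msg = message_from_string(raw)

# Metadata: structured fields that can be looked up by name.
sender = msg["From"]
subject = msg["Subject"]

# Body: unstructured free text with no predefined fields.
body = msg.get_payload()

print(sender, "|", subject)
print(body)
```

The headers behave like columns in a table, but any meaning inside the body has to be
extracted by further text processing.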

Unstructured data characteristics

• Various formats stored in native file format
• Qualitative
• Stored in data lakes or non-relational (NoSQL) databases
• Difficult to search directly; it must be processed before algorithms can
interpret it
• Requires more storage space than structured data
• Schema-on-read
• Requires data science expertise
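
As a rough sketch of the schema-on-read characteristic listed above, the following
Python example keeps raw text lines exactly as they arrived and applies a structure only
when the data is read. The log lines, the regular expression, and the field names
(date, level, message) are assumptions made for illustration.

```python
import re

# Raw data stored in its native form: plain text lines, no schema enforced.
raw_lines = [
    "2024-01-05 ERROR payment failed for order 1001",
    "2024-01-05 INFO user logged in",
    "free-form note without a timestamp",
]

# The "schema" is defined by the reader, not by the storage (hypothetical pattern).
pattern = re.compile(r"^(\d{4}-\d{2}-\d{2}) (\w+) (.*)$")

def read_with_schema(lines):
    """Apply the pattern at read time; lines that do not fit are skipped."""
    records = []
    for line in lines:
        match = pattern.match(line)
        if match:
            date, level, message = match.groups()
            records.append({"date": date, "level": level, "message": message})
    return records

records = read_with_schema(raw_lines)
errors = [r for r in records if r["level"] == "ERROR"]
print(len(records), "structured records,", len(errors), "error(s)")
```

A different reader could apply a different pattern to the same stored lines, which is
the flexibility that schema-on-read offers compared with a fixed table definition.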

Benefits of unstructured data

• Data is stored in its native format and stays undefined until it is needed, which
keeps it adaptable to a wider variety of uses.
• Data accumulation rates are faster, because anything can be collected without the
limitation of predefining the data.
• Option to store data in cloud data lakes that offer massive storage.

Q) Explain how to perform CRUD tasks in SQL against a database

A) CRUD Operations in SQL

CRUD stands for Create, Read, Update, and Delete, the four basic operations performed
on stored data. Almost every application that works with a database relies on them, so
one should be proficient in these operations before taking a deeper dive into any
database technology.

Let us understand the CRUD operations in SQL one at a time. The queries in this section
are written for the MySQL database.

1. Create:
In CRUD operations, 'C' stands for create, which means adding or inserting data into
an SQL table. First we create a table using the CREATE TABLE command, and then we
use the INSERT INTO command to insert rows into the created table.


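Syntax:
CREATE TABLE Table_Name (ColumnName1 Datatype, ColumnName2 Datatype, ...,
ColumnNameN Datatype);
INSERT INTO Table_Name (ColumnName1, ColumnName2, ..., ColumnNameN) VALUES
(Value1, Value2, ..., ValueN);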
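
As a minimal sketch of the Create step, the example below runs the two commands from
Python using the built-in sqlite3 module and an in-memory database. This is an
assumption made so the example runs without a database server; the notes themselves
target MySQL, where the same SQL statements should also work. The table name students
and its columns are hypothetical.

```python
import sqlite3

# In-memory SQLite database (an assumption; the notes use MySQL).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create: define the table (hypothetical name and columns).
cur.execute("CREATE TABLE students (id INTEGER, name TEXT, marks INTEGER)")

# Create: add rows with INSERT INTO.
cur.execute("INSERT INTO students (id, name, marks) VALUES (1, 'Asha', 82)")
cur.execute("INSERT INTO students (id, name, marks) VALUES (2, 'Ravi', 67)")
conn.commit()

# Count the rows to confirm that both inserts were stored.
row_count = cur.execute("SELECT COUNT(*) FROM students").fetchone()[0]
print(row_count)
```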

2. Read:
In CRUD operations, 'R' stands for read, which means retrieving or fetching data from
an SQL table. We use the SELECT command to fetch the inserted records. An asterisk (*)
in a SELECT query retrieves all the columns of the table, and a WHERE clause restricts
the result to only those records that satisfy a particular condition.

Syntax:
SELECT Column_Name_1, Column_Name_2, ..., Column_Name_N FROM Table_Name;
SELECT * FROM Table_Name WHERE condition;
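
The Read step can be sketched in the same way. The example again assumes Python's
built-in sqlite3 module with an in-memory database instead of MySQL, and a hypothetical
students table filled with made-up rows.

```python
import sqlite3

# In-memory SQLite database with a hypothetical table and sample rows.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE students (id INTEGER, name TEXT, marks INTEGER)")
cur.executemany(
    "INSERT INTO students (id, name, marks) VALUES (?, ?, ?)",
    [(1, "Asha", 82), (2, "Ravi", 67), (3, "Meena", 91)],
)

# Read every column of every row with an asterisk.
all_rows = cur.execute("SELECT * FROM students ORDER BY id").fetchall()

# Read selected columns of only the rows that satisfy a WHERE condition.
top_rows = cur.execute(
    "SELECT name, marks FROM students WHERE marks > 80 ORDER BY id"
).fetchall()

print(all_rows)
print(top_rows)
```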

3. Update:
In CRUD operations, 'U' stands for update, which means changing the records already
present in an SQL table. We use the UPDATE command to make these changes. If the
WHERE clause is omitted, every row in the table is updated.

Syntax:
UPDATE Table_Name SET column_name1 = value1, ..., column_nameN = valueN
[WHERE condition];
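
A small sketch of the Update step follows, under the same assumptions as before:
Python's built-in sqlite3 module with an in-memory database rather than MySQL, and a
hypothetical students table with made-up rows.

```python
import sqlite3

# In-memory SQLite database with a hypothetical table and sample rows.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE students (id INTEGER, name TEXT, marks INTEGER)")
cur.executemany(
    "INSERT INTO students (id, name, marks) VALUES (?, ?, ?)",
    [(1, "Asha", 82), (2, "Ravi", 67)],
)

# Update: change the marks only for the row whose id is 2.
cur.execute("UPDATE students SET marks = 75 WHERE id = 2")
rows_changed = cur.rowcount
conn.commit()

# Check the changed row and a row that the WHERE clause excluded.
updated = cur.execute("SELECT marks FROM students WHERE id = 2").fetchone()[0]
untouched = cur.execute("SELECT marks FROM students WHERE id = 1").fetchone()[0]
print(rows_changed, updated, untouched)
```

The WHERE clause is what limits the change to one row; the same statement without it
would set the marks of every student to 75.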

4. Delete:
In CRUD operations, 'D' stands for delete, which means removing records from an SQL
table. A DELETE query without a WHERE clause removes all the rows from the table.
Adding a WHERE clause removes only the specific records that satisfy a particular
condition.

Syntax:
DELETE FROM Table_Name WHERE condition;
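
The Delete step can be sketched as below, once more assuming Python's built-in sqlite3
module with an in-memory database in place of MySQL and a hypothetical students table
with made-up rows. It shows both a conditional delete and a delete of all rows.

```python
import sqlite3

# In-memory SQLite database with a hypothetical table and sample rows.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE students (id INTEGER, name TEXT, marks INTEGER)")
cur.executemany(
    "INSERT INTO students (id, name, marks) VALUES (?, ?, ?)",
    [(1, "Asha", 82), (2, "Ravi", 67), (3, "Meena", 91)],
)

# Delete only the row that satisfies the WHERE condition.
cur.execute("DELETE FROM students WHERE id = 2")
remaining = [row[0] for row in cur.execute("SELECT id FROM students ORDER BY id")]

# Delete without WHERE removes every remaining row; the empty table still exists.
cur.execute("DELETE FROM students")
empty_count = cur.execute("SELECT COUNT(*) FROM students").fetchone()[0]
conn.commit()

print(remaining, empty_count)
```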
