Dam Unit - V
Data serves as the basis for generating information, insights, and knowledge when it is
processed, analyzed, and interpreted. In the context of organizations and decision-making,
data is a valuable asset that helps in understanding patterns, trends, and relationships,
ultimately leading to more informed and data-driven decisions.
Data is commonly grouped into the following categories:
Structured Data :
Structured data is organized in a predefined format and is easily managed with traditional
data management tools such as spreadsheets, databases, or tables. It is typically
quantitative, consisting of numbers, percentages, and other numerical values, although it
can also hold fixed-format text such as names or dates. Because of its organized nature,
structured data is relatively easy to analyze using statistical methods such as regression
analysis or correlation analysis.
Unstructured Data :
Unstructured data has no predefined format or organization, making it difficult to manage
using traditional data management tools. Examples include social media posts, emails,
images, and videos. Unstructured data is typically qualitative, meaning that it is
descriptive and narrative. Analyzing it requires advanced analytics techniques such as
natural language processing (NLP) or sentiment analysis.
Semi-Structured Data :
Semi-structured data is a type of data that has elements of both structured and
unstructured data. This type of data includes information that is partially organized, but
not to the extent that it can be classified as structured data. Examples of semi-structured
data include XML and JSON files, which have some organization but also contain
elements of unstructured data. Analyzing semi-structured data typically requires a
combination of traditional data management tools and advanced analytics techniques.
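To make the idea of "partially organized" data concrete, here is a minimal Python sketch. The records and field names are invented for illustration. Every record carries some common keys, but optional and nested fields vary, so the code has to tolerate missing values before the data can be treated like a table.

```python
import json

# Hypothetical semi-structured records: every record has "id" and "name",
# but "email" and the nested "tags" list are optional.
raw = """
[
  {"id": 1, "name": "Asha", "email": "asha@example.com", "tags": ["vip", "retail"]},
  {"id": 2, "name": "Ravi"},
  {"id": 3, "name": "Meena", "tags": []}
]
"""

records = json.loads(raw)

# Flatten each record into a fixed set of columns, filling gaps with defaults.
rows = [
    (r["id"], r["name"], r.get("email", ""), len(r.get("tags", [])))
    for r in records
]

for row in rows:
    print(row)
```

The flattening step is the "traditional tool" half of the work; interpreting free-text fields would need the more advanced techniques mentioned above.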
Big data: Refers to massive data sets that need to be analyzed using advanced software
to reveal patterns and trends. It is considered to be one of the best analytical assets as it
provides larger volumes of data at a faster rate.
NEELIMA pg. 1
DAM UNIT - V BA III YEAR
Metadata: Putting it simply, metadata is data that provides insights about other data. It
summarizes key information about specific data that makes it easier to find and reuse for
later purposes.
Real time data: As its name suggests, real time data is presented as soon as it is
acquired. From an organizational perspective, this is often the most valuable data, as it
can help you make important decisions based on the latest developments.
Machine data: This is data generated automatically by machines such as phones, computers,
websites, and embedded systems, without direct human input.
Open data : Open data is data that is freely available to anyone in terms of its use (the
chance to apply analytics to it) and rights to republish without restrictions from
copyright, patents or other mechanisms of control. The Open Data Institute states
that open data is only useful if it’s shared in ways that people can actually understand. It
needs to be shared in a standardized format and easily traced back to where it came from.
One of the most compelling reasons for businesses to collect data is that it helps a
company make better decisions. Decisiveness is a useful trait for a business because it
lets the company make tough decisions more quickly and understand the repercussions or
benefits of those decisions. For example, if a company wants to expand into a
new market, collecting data is a necessity because the company needs information on how
the market works, where it might fit into that market and what kinds of customers it might
serve once entrenched in that new market.
Customer satisfaction can help improve customer loyalty and trust, which may increase
sales and customer referrals. With raw data, businesses can study the effects of their efforts
on customer satisfaction and learn where they can improve. This can help the company
create a more pleasing, customized experience for each customer, helping to separate that
business from the competition. For example, a business might poll its customers with a
short digital survey after each purchase to ask questions about their experience. The
company can use that data to identify positive or negative trends and take action.
3. Increases revenue and profits
Data may also help a company increase its revenue and profits by making the company
more efficient, providing key insights into operations and customer satisfaction and
helping to improve certain processes. Data can help businesses measure whether certain
actions, products or services are profitable and where their greatest expenses might be.
Identifying expenses is often the key to increasing profits because businesses can reduce
those expenses and keep more of the revenue they earn. Raw data helps the company
identify where it can trim expenses, increase efforts and earn more revenue.
Data also plays a critical role in problem-solving for company leaders. With an abundance
of data, company leaders can identify and address key problems and monitor the effects of
proposed solutions. For example, if a company identifies an issue in its manufacturing
processes, it might collect data on how much each unit costs to produce and how much
revenue it is losing through reduced production. Solving problems is easier, and
solutions are more effective, when the person solving the problem has sufficient
information. Understanding the problem in its entirety is typically the first step toward
solving that problem.
Company processes may also benefit from data, as it can show company leaders how
efficient or costly certain processes are. With this data, the company's executives can
determine how to make processes like manufacturing or customer service more efficient.
For example, a company might collect data on its marketing process and find that it's
allocating 50% of the marketing budget to social media campaigns that aren't generating
qualified leads. With this information, the company can determine whether to pull the
funding from that method and allocate it elsewhere for a better return on the investment
and continued financial savings.
8. Security and Fraud Detection: Data is crucial for identifying security threats and
detecting fraudulent activities within an organization's systems.
A data model is created by examining how the information currently exists, identifying
the entities within the system, and determining where they fit in relation to each other.
It’s similar to an organizational chart, but instead of highlighting lines of authority, it
shows how information is organized.
A conceptual model is a diagram that describes what your business does and how things
work together. It’s a high-level view of entities and their relationships, and it’s usually
created to give stakeholders a broad overview of the database. Data modeling tools can
speed up the creation of a conceptual model for your database.
The logical data model provides the foundation for creating physical data models. These
can be used to define tables in relational databases (using SQL) or objects in
object-oriented languages such as Java or C++.
The simplest form of data modeling involves creating models that describe how data
should be stored in tables. These models are then implemented into one or more
databases. A more complex form of data modeling involves creating a logical model
that describes how data will be accessed and manipulated by end-users and applications
that consume it.
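The step from a logical model to a physical one can be sketched in a few lines of Python. This is only an illustration: the `customer` entity, the type mapping, and the choice of SQLite (through Python's built-in `sqlite3` module) are all assumptions, not part of the original notes.

```python
import sqlite3

# Logical model: entities and attributes, independent of any database product.
# The "customer" entity is a hypothetical example.
logical_model = {
    "customer": {"customer_id": "integer", "name": "text", "city": "text"},
}

# Physical mapping for one specific database (SQLite is assumed here).
SQLITE_TYPES = {"integer": "INTEGER", "text": "TEXT"}

def create_table_sql(table, attributes):
    """Derive a physical table definition from a logical entity."""
    cols = ", ".join(f"{name} {SQLITE_TYPES[kind]}" for name, kind in attributes.items())
    return f"CREATE TABLE {table} ({cols})"

ddl = create_table_sql("customer", logical_model["customer"])
print(ddl)

# Implement the physical model in an in-memory database and store one row.
conn = sqlite3.connect(":memory:")
conn.execute(ddl)
conn.execute("INSERT INTO customer VALUES (1, 'Asha', 'Hyderabad')")
stored = conn.execute("SELECT name, city FROM customer").fetchall()
print(stored)
```

Keeping the logical description separate from the type mapping means the same model could be implemented in a different database by swapping only the mapping.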
▪ Data modelling refers to the process of designing how data is stored and retrieved by
an organization. It analyzes the needs of the business, applies pretested templates (or
patterns) and best practices, and creates a database structure that can efficiently provide
the business with the information and insights it needs.
Information is of little use unless it is delivered in a format that business users can
consume. Data modeling helps translate the requirements of business users into a data
model that can support business processes and scale analytics.
11. Aids Data Governance: Data modeling facilitates data governance initiatives,
ensuring compliance with regulations and data management policies.
12. Supports Data Analysis: Data models provide a structured framework for data
analysis and reporting, enabling meaningful insights.
13. Encourages Collaboration: Data modeling encourages collaboration among
business analysts, developers, and stakeholders in the data modeling process.
14. Minimizes Development Errors: By defining data requirements upfront, data
modeling reduces errors during the development phase.
15. Long-term Investment: A well-maintained data model is a long-term
investment that provides value throughout the lifecycle of the data and
applications.
There are several different database model types. Some are old, while others are newer
and cater to modern requirements. Here is a list of 7 popular database models:
1. Hierarchical Model
2. Network Model
3. Entity-relationship Model
4. Relational Model
5. Object-oriented Model
6. NoSQL Model
7. Graph Model
Let's learn about the different types of database models along with their main features
and when you should use them.
1. Hierarchical Model
• The hierarchical database model organizes data into a tree-like structure, with
a single root, to which all the other data is linked.
• The hierarchy starts from the Root data, and expands like a tree,
adding child nodes to the parent nodes.
• In this model, a child node will only have a single parent node.
• This model efficiently describes many real-world relationships like the index of a
book, etc.
• IBM's Information Management System (IMS) is based on this model.
• Data is organized into a tree-like structure with a one-to-many
relationship between two different types of data. For example,
one department can have many courses, many teachers, and
many students (as shown in the diagram below).
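The single-parent rule and the one-to-many relationship described above can be sketched in plain Python. The node names follow the department example; the extra `Assignment` node is hypothetical and is added only to show a deeper level of the tree.

```python
# Each node records exactly one parent; the root has parent None.
parent_of = {
    "Department": None,          # root
    "Course": "Department",
    "Teacher": "Department",
    "Student": "Department",
    "Assignment": "Course",      # hypothetical deeper level
}

def path_to_root(node):
    """Follow the single parent link upward until the root is reached."""
    path = [node]
    while parent_of[path[-1]] is not None:
        path.append(parent_of[path[-1]])
    return path

def children_of(node):
    """One-to-many: a parent may have any number of children."""
    return sorted(child for child, parent in parent_of.items() if parent == node)

print(path_to_root("Assignment"))
print(children_of("Department"))
```

Because every node has only one parent, there is exactly one path from any record to the root, which is what makes lookups in this model simple but inflexible.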
Here are a few points to mark the advantages and disadvantages of the Hierarchical
database model:
2. Network Model
• The Network Model is an extension of the Hierarchical model.
• In this model, data is organized more like a graph, and allowed to have more than
one parent node.
• In the network database model, records are more richly connected because more
relationships are established between them.
• Because of these additional links, accessing related data is often easier and
faster.
• This database model uses many-to-many data relationships.
• Integrated Data Store (IDS) is based on this database model.
• This was the most widely used database model before Relational Model was
introduced.
• The implementation of the Network model is complex, and it is very difficult to
maintain.
• The Network model is also difficult to modify.
• You may want to explore this model if you are developing a social networking
application, although the newer Graph database model is usually a better fit than the
Network database model.
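As a rough illustration of the many-to-many idea, the following Python sketch links students and courses in both directions, so a course can have more than one "parent" student and a student more than one course. The names are invented, and real network databases use record and set types rather than Python dictionaries.

```python
from collections import defaultdict

# Hypothetical enrolment links between students and courses.
enrollments = [
    ("Asha", "Statistics"),
    ("Asha", "Databases"),
    ("Ravi", "Databases"),
]

courses_of = defaultdict(set)   # student -> courses
students_in = defaultdict(set)  # course -> students (more than one parent allowed)

for student, course in enrollments:
    courses_of[student].add(course)
    students_in[course].add(student)

print(sorted(courses_of["Asha"]))
print(sorted(students_in["Databases"]))
```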
3. Entity-relationship Model
• In this database model, relationships are created by dividing objects of interest
into entities and their characteristics into attributes.
• Different entities are related using relationships.
• ER Models are defined to represent the relationships in pictorial form to make it
easier for different stakeholders to understand.
• This model is good to design a database, which can then be turned into tables in a
relational model (explained below).
• For example, if we have to design a School Database, then
the Student will be an entity with attributes name, age, address, etc. As
an Address is generally complex, it can be
another entity with attributes street, pincode, city, etc., and there will be a
relationship between them.
• Relationships can also be of different types, such as one-to-one, one-to-many,
and many-to-many.
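The School Database example can be sketched with Python dataclasses, where each class plays the role of an entity, its fields are the attributes, and a field that refers to another class expresses the relationship. The sample values are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Address:           # entity
    street: str          # attributes
    pincode: str
    city: str

@dataclass
class Student:           # entity
    name: str
    age: int
    address: Address     # relationship: a Student lives at an Address

home = Address(street="12 MG Road", pincode="500001", city="Hyderabad")
student = Student(name="Asha", age=20, address=home)

print(student.address.city)
```

In a relational design, the same relationship would usually become a key column that links a student row to an address row.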
4. Relational Model
• In this model, data is organized in two-dimensional tables and the relationship is
maintained by storing a common field.
• This model was introduced by E. F. Codd in 1970, and since then it has been the
most widely used database model.
• The basic structure of data in the relational model is the table. All the information
related to a particular type is stored in the rows of that table.
• Tables are also known as relations in the relational model.
• You can design tables, normalize them to reduce data redundancy,
and use Structured Query Language (SQL) to access data from the tables.
• Some of the most popular databases are based on this database model. For
example, Oracle, MySQL, etc.
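The idea of maintaining a relationship "by storing a common field" can be shown with two small tables. This sketch uses Python's built-in `sqlite3` module rather than Oracle or MySQL so that it runs without a server; the table names, columns, and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name TEXT,
        dept_id INTEGER REFERENCES department(dept_id)  -- the common field
    );
    INSERT INTO department VALUES (1, 'Commerce'), (2, 'Economics');
    INSERT INTO student VALUES (101, 'Asha', 1), (102, 'Ravi', 2), (103, 'Meena', 1);
""")

# The JOIN follows the common dept_id field to combine rows from both tables.
rows = conn.execute("""
    SELECT s.name, d.dept_name
    FROM student AS s
    JOIN department AS d ON s.dept_id = d.dept_id
    ORDER BY s.student_id
""").fetchall()

print(rows)
```

Storing the department name once and referring to it by `dept_id` is a small instance of the normalization mentioned above: the name is not repeated in every student row.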
5. Object-oriented Model
• In this model, data is stored in the form of objects.
• The behavior of the object-oriented database model is just like object-oriented
programming.
• MongoDB is often cited as an example of this model, but it is a document-oriented
NoSQL database rather than a true Object Database Management System (ODBMS).
Dedicated object databases include db4o and ObjectDB.
• This database model is less mature than the relational database model.
Advantages of the Object-oriented Model
6. NoSQL Model
• The NoSQL database model supports an unstructured style of storing data.
• Data is stored as documents.
• The documents look more like JSON strings or Key-value based object
representations.
• It provides a flexible schema.
• It provides features such as indexing and relationships between data.
• Support for ad hoc data querying is more limited than in SQL databases.
• This database model is well-suited for Big data applications, real-time analytics,
CMS (Content Management systems), etc.
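The flexible, document-style storage described above can be imitated with plain Python dictionaries. This is a stand-in for illustration only and does not use any real NoSQL database API; the `articles` collection and its fields are hypothetical.

```python
import json

# Documents in one hypothetical "articles" collection; fields vary per document.
articles = [
    {"_id": 1, "title": "Intro to data", "tags": ["basics"]},
    {"_id": 2, "title": "Graph models", "author": {"name": "Ravi"}, "tags": ["graph", "nosql"]},
    {"_id": 3, "title": "Untitled draft"},
]

# A simple query: titles of documents tagged "nosql". Missing fields are tolerated.
nosql_titles = [doc["title"] for doc in articles if "nosql" in doc.get("tags", [])]
print(nosql_titles)

# Documents serialize naturally to JSON strings.
print(json.dumps(articles[1]))
```

No schema had to be declared before the third document omitted `tags`, which is the flexibility the list refers to. The cost is that every query must handle fields that may be absent.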
7. Graph Model
• The Graph database model represents data in a way that closely mirrors real-world
relationships.
• Data is represented using nodes (entities).
• The nodes are related using edges (relationships).
• The popular database Neo4j is based on the Graph database model.
• If your application has simple data requirements, then you should not use the
graph database model.
• In modern applications like social networks, recommendation systems, etc. the
graph database model is well-suited.
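A recommendation of the "people you may know" kind shows why nodes and edges suit social data. The sketch below stores edges as adjacency sets in plain Python; the names are invented, and a graph database such as Neo4j would express the same traversal in its own query language.

```python
# Nodes are people; edges are friendship links stored as adjacency sets.
friends = {
    "Asha": {"Ravi", "Meena"},
    "Ravi": {"Asha", "Kiran"},
    "Meena": {"Asha", "Kiran", "Latha"},
    "Kiran": {"Ravi", "Meena"},
    "Latha": {"Meena"},
}

def suggest(person):
    """Return friends-of-friends who are not already friends with `person`."""
    candidates = set()
    for friend in friends[person]:
        candidates |= friends[friend]
    return sorted(candidates - friends[person] - {person})

print(suggest("Asha"))
```

The query only walks edges outward from one node, so its cost depends on the size of the neighbourhood rather than on the size of the whole data set.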
Definable attributes
Structured data has the same attributes for all data values. For example, every booking record
could have these attributes: booking name, event name, event date, and booking amount.
Relational attributes
Structured data tables have common values that link different datasets together. For
example, you can relate customer data with booking data by using customer
id and booking id fields. So, you can store structured data conveniently in a relational
database.
Quantitative data
Structured data lends itself well to mathematical analysis. For example, you can count and measure the
frequency of attributes and perform mathematical operations on numerical data.
Storage
You can store structured data in relational databases and manage it using structured query
language (SQL). SQL lets you define a data model called a schema under which you determine
preset rules—such as fields, formats, and values—for your data. You can then store structured
data in data warehouses or other relational database technology.
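As a sketch of a schema with preset rules, the following Python example defines a table for the booking attributes mentioned earlier and lets the database reject a row that breaks a rule. It assumes SQLite through Python's built-in `sqlite3` module; the column types, the non-negative amount rule, and the sample rows are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE booking (
        booking_id     INTEGER PRIMARY KEY,
        booking_name   TEXT NOT NULL,
        event_name     TEXT NOT NULL,
        event_date     TEXT NOT NULL,   -- ISO date string, e.g. 2024-05-01
        booking_amount REAL NOT NULL CHECK (booking_amount >= 0)
    )
""")

conn.execute("INSERT INTO booking VALUES (1, 'Asha', 'Tech Fest', '2024-05-01', 499.0)")

try:
    # A negative amount violates the CHECK rule, so the row is rejected.
    conn.execute("INSERT INTO booking VALUES (2, 'Ravi', 'Tech Fest', '2024-05-01', -10)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

stored_rows = conn.execute("SELECT COUNT(*) FROM booking").fetchone()[0]
print(rejected, stored_rows)
```

Because the rules live in the schema, every application that writes to the table is held to the same fields, formats, and values.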
Structured data examples
Here are examples of structured data systems:
• Excel files
• SQL databases
• Point-of-sale data
• Web form results
• Search engine optimization (SEO) tags
• Product directories
• Inventory control
• Reservation systems
A)
1. Oracle Database
Oracle Database is a proprietary multi-model database management system produced and
marketed by Oracle Corporation. It is commonly used for running online transaction
processing, data warehousing, and mixed database workloads. Edition 23c offers native
support for property graph data structures and graph queries.
1. Cassandra
Originally developed by Facebook, this NoSQL database is now managed by the Apache
Software Foundation. It’s used by many organizations with large, active datasets,
including Netflix, Twitter, Urban Airship, Constant Contact, Reddit, Cisco and Digg.
Commercial support and services are available through third-party vendors. Operating
System: OS Independent.
3. MySQL
MySQL is a free and open-source relational database management system (RDBMS) that is
widely used in web applications and by businesses and individuals of all sizes and
industries.
3. MongoDB
MongoDB was designed to support humongous databases. It’s a NoSQL database with
document-oriented storage, full index support, replication and high availability, and
more. Commercial support is available through MongoDB, Inc. (formerly 10gen). Operating
system: Windows, Linux, OS X, Solaris.
6. Google Cloud SQL
Cloud SQL is a fully managed relational database service for MySQL, PostgreSQL, and SQL
Server with rich extension collections, configuration flags, and developer ecosystems.
6. FlockDB
Best known as Twitter’s database, FlockDB was designed to store social graphs (i.e., who
is following whom and who is blocking whom). It offers horizontal scaling and very fast
reads and writes. Operating System: OS Independent.
• Email: Email message fields are unstructured and cannot be parsed by traditional
analytics tools. That said, email metadata affords it some structure, and explains
why email is sometimes considered semi-structured data.
• Text files: This category includes word processing documents, spreadsheets,
presentations, email, and log files.
• Social media and websites: data from social networks like Twitter, LinkedIn, and
Facebook, and websites such as Instagram, photo-sharing sites, and YouTube.
• Mobile and communications data: For this category, look no further than text
messages, phone recordings, collaboration software, chat, and instant messaging.
• Media: This data includes digital photos, audio, and video files.
• Scientific data: This includes oil and gas surveys, space exploration, seismic
imagery, and atmospheric data.
• Digital surveillance: This category features data like reconnaissance photos and
videos.
• Satellite imagery: This data includes weather data, land forms, and military
movements.
Benefits of unstructured data
• Data is stored in native format, which provides access to a wider variety of more
adaptable data.
• Data accumulation rates are faster, because anything can be collected without the
limitation of predefining the data.
• Option to store data in cloud data lakes that offer massive storage.
A) CRUD operations:
Let us start with the understanding of CRUD operations in SQL with the help of examples.
We will be writing all the queries in the supporting examples using the MySQL database.
1. Create:
In CRUD operations, 'C' stands for create, which means adding or inserting data into
an SQL table. We first create a table using the CREATE TABLE command and then
use the INSERT INTO command to insert rows into the created table.
Syntax:
CREATE TABLE Table_Name (ColumnName1 Datatype, ColumnName2 Datatype, ..., ColumnNameN Datatype);
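A minimal runnable sketch of the create step follows. The notes use MySQL; this example substitutes Python's built-in `sqlite3` module so it runs without a database server, and the `employee` table and its rows are hypothetical. The `?` placeholders are the `sqlite3` style; MySQL client libraries typically use a different placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Create: define the table, then insert rows into it.
conn.execute("CREATE TABLE employee (emp_id INT, name VARCHAR(50), salary INT)")
conn.execute("INSERT INTO employee (emp_id, name, salary) VALUES (1, 'Asha', 30000)")

# Parameter placeholders avoid building SQL strings by hand.
conn.executemany(
    "INSERT INTO employee (emp_id, name, salary) VALUES (?, ?, ?)",
    [(2, "Ravi", 28000), (3, "Meena", 35000)],
)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(count)
```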
2. Read:
In CRUD operations, 'R' stands for read, which means retrieving or fetching
the data from the SQL table. We use the SELECT command to fetch the
inserted records from the SQL table. We can retrieve all the records from a table using an
asterisk (*) in a SELECT query. There is also an option of retrieving only those records
which satisfy a particular condition by using the WHERE clause in a SELECT query.
Syntax:
SELECT Column_Name_1, Column_Name_2, ....., Column_Name_N FROM
Table_Name;
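The two ways of reading described above, fetching every row with an asterisk and filtering with a WHERE clause, can be sketched as follows. As before, Python's `sqlite3` module is assumed in place of MySQL, and the `employee` table and its contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INT, name VARCHAR(50), salary INT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", 30000), (2, "Ravi", 28000), (3, "Meena", 35000)],
)

# Read every column of every row.
all_rows = conn.execute("SELECT * FROM employee ORDER BY emp_id").fetchall()

# Read only selected columns of the rows that satisfy a condition.
high_paid = conn.execute(
    "SELECT name, salary FROM employee WHERE salary >= ? ORDER BY emp_id",
    (30000,),
).fetchall()

print(all_rows)
print(high_paid)
```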
3. Update:
In CRUD operations, 'U' stands for update, which means making changes
to the records present in the SQL tables. We use the UPDATE command to
modify the data present in tables.
Syntax:
UPDATE table_name SET column_name1 = value1, ..., column_nameN = valueN [WHERE condition];
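Here is a small sketch of the update step, again assuming Python's `sqlite3` module in place of MySQL and a hypothetical `employee` table. The WHERE clause limits the change to one record; without it, every row in the table would be updated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INT, name VARCHAR(50), salary INT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", 30000), (2, "Ravi", 28000), (3, "Meena", 35000)],
)

# Update one record; rowcount reports how many rows were changed.
cur = conn.execute("UPDATE employee SET salary = ? WHERE emp_id = ?", (32000, 2))
changed = cur.rowcount

salaries = conn.execute("SELECT emp_id, salary FROM employee ORDER BY emp_id").fetchall()
print(changed, salaries)
```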
4. Delete:
In CRUD operations, 'D' stands for delete, which means removing or deleting
the records from the SQL tables. We can delete all the rows from the SQL tables using
the DELETE query. There is also an option to remove only the specific records that satisfy
a particular condition by using the WHERE clause in a DELETE query.
Syntax:
DELETE FROM Table_Name WHERE condition;
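Both forms of delete, removing only matching rows and removing all rows, can be sketched as follows. The example assumes Python's `sqlite3` module in place of MySQL and a hypothetical `employee` table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INT, name VARCHAR(50), salary INT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", 30000), (2, "Ravi", 28000), (3, "Meena", 35000)],
)

# Delete only the rows that satisfy the condition.
deleted = conn.execute("DELETE FROM employee WHERE salary < ?", (30000,)).rowcount
remaining = [row[0] for row in conn.execute("SELECT name FROM employee ORDER BY emp_id")]

# Without a WHERE clause, every row is removed but the table itself remains.
conn.execute("DELETE FROM employee")
left = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]

print(deleted, remaining, left)
```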