MSA8040 Lecture 1
Swetha Siddhantam
Email: [email protected]
Office hours: Tuesday 11 am – 1 pm
Be familiar with relational database concepts
Be proficient in manipulating data using SQL
Understand structured and unstructured data
Be familiar with MongoDB
Be able to extract, store and query unstructured data
Be able to recall and discuss algorithms for analysis of unstructured data
Be familiar with Python
Apply unstructured data analytics techniques to solve real problems
Overview of the Course:
• ER / Design DB
• SQL
• Use of SQL in Software / SQL standard
• NoSQL / ETC
Overview
Section 1: Introduction
• Data Management for Analytics
• ER model
• ER diagrams
• Normalization
• Relational model
Section 2: MySQL
• SQL statement syntax
• Advanced SQL
• Procedure
• Trigger
Section 3: NoSQL
• MongoDB
• CRUD
• Aggregation
Section 4: Web Scraping
• Selenium
• Navigating
• Locating elements
• Twitter API
Section 5: Text
• Topic Modeling
• LDA
• Dynamic LDA
• Sentiment analysis
• Neural network
• SVM
• Decision tree
08/24/2023 MSA8040-I4I 6
Why Study Databases??
Computation → Information
Files and Databases
File: A collection of records or
documents
• Manual (paper) files
• Computer files
Database: A collection of similar
records, along with their relationships
What Is a Database??
Shared, integrated computer structures that store
data
Components:
• End-user data: raw facts of interest to end user
• Metadata: data about data, through which end-user data is
integrated and managed
⎯ Describes data characteristics and relationships
⎯ Examples: the name of each data element, the type of values
(numeric, dates, or text) stored on each data element, and
whether the data element can be left empty
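As a minimal sketch of metadata in practice, the Python snippet below asks a DBMS to describe a table: the name of each data element, its declared type, and whether it can be left empty. It assumes SQLite through Python's standard-library sqlite3 module (the course itself uses MySQL); the student table and the describe_table helper are hypothetical names chosen for illustration.

```python
import sqlite3

def describe_table(conn, table):
    """Return (name, declared type, nullable) for each column of `table`."""
    # PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [(name, col_type, not notnull) for _, name, col_type, notnull, _, _ in rows]

conn = sqlite3.connect(":memory:")
# Hypothetical table: the end-user data would be the student rows themselves.
conn.execute(
    "CREATE TABLE student ("
    " student_id INTEGER NOT NULL,"
    " name TEXT NOT NULL,"
    " enrolled_on DATE,"
    " gpa REAL)"
)

# The metadata is what the DBMS knows about the columns, not the rows.
metadata = describe_table(conn, "student")
for name, col_type, nullable in metadata:
    print(name, col_type, "may be empty" if nullable else "required")
```

No student rows were inserted, yet the DBMS can already answer questions about the table's structure. That structural description is the metadata.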
A database management system (DBMS) is a
collection of programs that
• Manages the database structure
• Controls access to data stored in the database
Why Use a DBMS??
Minimal data redundancy
Data consistency, data integration,
and data sharing
Ease of application development,
reduced program maintenance
Uniform security, privacy, and
integrity controls
Data accessibility and responsiveness
Data independence
DBMS Functions
Data dictionary management
• Data dictionary: stores definitions of data
elements and their relationships
Data storage management
• Performance tuning keeps storage and
access efficient
Data transformation and presentation
• Data is formatted to conform to logical
expectations
Security management
• Enforces user security and data privacy
DBMS Functions
Multiuser access control
• Sophisticated algorithms ensure that
multiple users can access the database
concurrently without compromising its
integrity
Backup and recovery management
• Enables recovery of the database after a
failure
Data integrity management
• Minimizes redundancy and maximizes
consistency
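One way to see integrity management and recovery from a failed operation is a transaction that is rolled back when it would break a rule. The sketch below assumes SQLite through Python's sqlite3 module; the account table, its CHECK rule, and the transfer helper are hypothetical. It does not cover backups or concurrent users.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rule: a balance may never go negative.
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account (id, balance) VALUES (1, 100), (2, 50)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; return True on success, False if rejected."""
    try:
        # `with conn` commits if the block succeeds and rolls back on an exception.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK rule was violated, so the credit to dst was undone as well.
        return False

ok = transfer(conn, 1, 2, 30)         # allowed
rejected = transfer(conn, 1, 2, 500)  # would overdraw account 1
balances = dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
print(ok, rejected, balances)
```

The rejected transfer leaves both balances exactly as they were after the first transfer, even though its first UPDATE had already run. The DBMS, not the application code, restores the consistent state.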
DBMS Functions
Database access languages and
application programming interfaces
• Query language: lets the user specify what
must be done without having to specify how
• Structured Query Language (SQL): de facto
query language and data access standard
supported by the majority of DBMS vendors
Database communication interfaces
• Accept end-user requests via multiple,
different network environments
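The "what, not how" idea behind a query language can be sketched by computing the same totals twice: once with an explicit Python loop and once with a SQL statement. SQLite via Python's sqlite3 module is an assumption here, and the orders data is made up for illustration.

```python
import sqlite3

# Hypothetical sample data: (customer, amount) pairs.
orders = [("alice", 30), ("bob", 20), ("alice", 45), ("carol", 10)]

# Procedural version: we spell out HOW to compute the totals, step by step.
totals_how = {}
for customer, amount in orders:
    totals_how[customer] = totals_how.get(customer, 0) + amount

# Declarative version: the SQL states WHAT is wanted; the DBMS decides how.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
totals_what = dict(
    conn.execute("SELECT customer, SUM(amount) FROM orders GROUP BY customer")
)

print(totals_how == totals_what)
```

Both versions produce the same totals. The SQL version says nothing about loops or lookup structures, which leaves the DBMS free to choose the execution strategy.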
Data Modeling
Create a specific data model before building your
databases.
Data Models
A data model is a collection of concepts
for describing data
A schema is a description of a particular
collection of data, using a given data
model
The relational model of data is the most
widely used model today
• Main concept: relation, basically a table with
rows and columns
• Every relation has a schema, which describes
the columns, or fields
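The relation/schema distinction can be sketched in plain Python, with no DBMS involved. The schema names the columns and their types, and the relation is the set of rows that follow it. The Course schema and its rows below are hypothetical.

```python
from typing import NamedTuple

class Course(NamedTuple):
    """Schema of a hypothetical Course relation: its columns (fields) and types."""
    course_id: str
    title: str
    credits: int

# An instance of the relation: a table whose rows all follow the schema.
courses = [
    Course("C101", "Intro to Databases", 3),
    Course("C102", "Text Analytics", 3),
]

columns = Course._fields                     # the schema's column names
titles = [row.title for row in courses]      # one column, read across all rows
print(columns)
print(titles)
```

A real relational DBMS also enforces the schema (types, required fields, keys) on every row. This sketch shows only the shape.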
Evolution of Data Models
NoSQL Databases
• Usually very simple key/value search operations
• May use distributed parallel processing
(grid/cloud, e.g. MongoDB + Hadoop)
• Semantic Web “TripleStores” are one type
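The two access patterns named above can be sketched with plain Python data structures. This is an illustration of the ideas only, not how MongoDB or any real triple store is implemented; all keys, names, and the match helper are hypothetical.

```python
# Key/value store: each lookup goes straight from a key to its value.
kv_store = {
    "user:1": {"name": "Ada", "tags": ["sql", "python"]},
    "user:2": {"name": "Lin", "tags": ["mongodb"]},
}
user = kv_store.get("user:1")

# Triple store: facts are (subject, predicate, object) triples.
triples = [
    ("Ada", "knows", "Lin"),
    ("Ada", "uses", "Python"),
    ("Lin", "uses", "MongoDB"),
]

def match(triples, subject=None, predicate=None, obj=None):
    """Return triples matching the given parts; None acts as a wildcard."""
    return [
        (s, p, o)
        for s, p, o in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]

uses = match(triples, predicate="uses")
print(user["name"], uses)
```

Neither structure has a fixed table schema. The key/value store answers "give me the value for this key", while the triple store answers pattern questions such as "who uses what".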
Well-designed DBs Facilitate Data Management
Databases Make These
Folks HAPPY …
End users and DBMS vendors
DB application programmers
• E.g., smart webmasters
Database administrator (DBA)
• Designs logical/physical schemas
• Handles security and authorization
• Ensures data availability and crash recovery
• Tunes the database as needs evolve
Must understand how a DBMS works!
Structure of a DBMS
A typical DBMS has a layered architecture
• Layer shown: query optimization and execution