
KIT712lecture1 2019

This document provides an overview of the KIT712 Data Management Technology unit. It introduces the lecturer, Dr. Saurabh Kumar Garg, and discusses the unit content, learning activities, assessment, and resources. The unit covers topics such as entity relationship modeling, conceptual, logical, and physical database modeling, SQL, query optimization, database administration, and NoSQL databases. Assessment includes assignments, in-semester tests, and a final exam. Students are advised to actively participate in the unit, complete work on time, and seek help if needed.

Uploaded by

rajibcqu
Copyright
© All Rights Reserved

KIT712

Data Management Technology


LECTURE 1 BY
DR. SAURABH KUMAR GARG
About Me

Present
 Lecturer at UTAS

Past
 B.Tech/M.Tech from Indian Institute of Technology (IIT), Delhi
 PhD from the University of Melbourne
 Postdoctoral Fellow at IBM Research

[Word cloud: Distributed, Optimisation, Decision Making, Cloud Computing, BigData, Data Analytics, Stream Computing, MapReduce, Education Analytics, Acoustic Data Analytics, Privacy]
Teaching Team

 Unit Coordinator
 Dr. Saurabh Garg
 Lecturer in Hobart
 Office: Cent 462 (ICT wing)
 Email: [email protected]
 Lecturer in Launceston
 Dr Son Tran
 Office:V112
Introduction to KIT712

 Motivation
 Unit Content
 Learning Activities
 Assessment
 Resources
 Tips for Success
Motivation

What is Data to You?


Our Data-driven World

 Science
 Databases from astronomy, genomics, environmental data, transportation data, …
 Humanities and Social Sciences
 Scanned books, historical documents, social interactions data, new technology like GPS …
 Business & Commerce
 Corporate sales, stock market transactions, census, airline traffic, …
 Entertainment
 Internet images, Hollywood movies, MP3 files, …
 Medicine
 MRI & CT scans, patient records, …
How to Store Data?
Do we want to store like this?

Need Organization????
Need Organization
Data models
Why are you doing this unit?
Let us be honest!!

 It is a compulsory unit. I am here only for Permanent Residency (PR)
 Consequence: Always distressed and may fail in the end
 It is a compulsory unit. I am forced to do this unit.
 Consequence: Hard struggle and may pass
 I will learn industry-based technology and improve my skills
 Consequence: Will enjoy the unit and will be eager for inputs
Unit Outline

 Very Important
 Soft copy available on
 MyLO under Information
 Online at http://www.utas.edu.au/computing-information-systems/resources/unit-outlines/
 Read It
Teaching Pattern

 This unit has:


 Lectures – 1 hour per week (Tuesday 9am-10am) (except Lectures 1 and 13)
 Tutorials – 2 hours per week
 Online learning modules – up to 2 hours per week on average
 Self-Study – 3 hours per week
 Work on assignments
 Prepare for tutorials
 Study/Revision/Self check quizzes
Prerequisites (Assumptions)

 Basic SQL
 Basic Programming
 Hard Working
Textbooks (Reference Only)

 Database Systems: Design, Implementation and Management, by Coronel and Morris, Cengage Learning publisher.
 Oracle 11g: SQL, by Joan Casteel, published 2010 by Course Technology, Cengage Learning
 Oracle 11g: PL/SQL, by Joan Casteel, published 2010 by Course Technology, Cengage Learning
Learning Outcomes

LO1. Evaluate and critically analyse alternative techniques and data models for designing databases;
LO2. Adapt and apply techniques and processes for designing, implementing and administering an enterprise-level relational database;
LO3. Design sophisticated SQL queries to efficiently retrieve information from relational databases;
LO4. Understand and appreciate data storage and retrieval issues with current trends and advances in database technologies.
Topics

 Introduction to Systems and Databases;


LO1
 Entity Relationship Model review and extension;
 Conceptual, logical, physical Modelling;
LO2
 SQL Review and advanced SQL;
LO3 LO4
 SQL Query Optimisation;
 Triggers, Procedures and Functions;
LO2
 Database Administration.
 Overview of NoSQL Databases. LO1
Tentative Tutorial Schedules

 Week 2: ER Modelling and Relational Model LO1


 Week 3: Relational Algebra
 Week 4: SQL Revision using Oracle

LO2
 Week 5: TEST (SQL Assignment due)
 Week 6: SQL Query Optimisation I
 Week 7: SQL Query Optimisation II
 Week 8: Lab Test LO3 LO4
 Week 9: PL/SQL I
 Week 10: PL/SQL II LO2
 Week 11: Lab TEST
 Week 12: Database Administration 1
 Week 13: Lab Test
LO1
Online Modules

 Data Modelling
 SQL Review (Oracle)
 Advanced SQL (PL/SQL, Triggers, Cursors)
 Query Optimizations
 Database Administration
Assessment - Overview

 In-Semester Assessment 60%


 Final Exam 40%
 To pass this unit you need to:
 Pass all the learning outcomes
 Gain at least 50% of the overall mark
Assessment - In-Semester

 60% of overall mark


 MUST gain at least 45% of the total mark in this part to pass the unit
 Assignments - tasks published on MyLO (10%)
 Database Design
 In-Semester Tests (conducted in the tutorial)
 Database Implementation (15%)
 Query Design and Optimization (15%)
 Database Constraints Implementation (PL/SQL) (15%)
 Database Administration (5%)
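The numeric hurdles on the two assessment slides come down to simple arithmetic. The sketch below is illustrative only: the function name and the sample marks are assumptions, and it does not model the separate requirement to pass all the learning outcomes.

```python
# Illustrative sketch only: names and sample marks are assumptions, not unit policy.
IN_SEMESTER_WEIGHT = 60.0  # in-semester assessment is worth 60% of the unit
EXAM_WEIGHT = 40.0         # final exam is worth 40% of the unit


def meets_mark_hurdles(in_semester_mark: float, exam_mark: float) -> bool:
    """Check the two numeric hurdles described on the slides.

    in_semester_mark is out of 60 and exam_mark is out of 40.
    Passing every learning outcome is a separate requirement not modelled here.
    """
    in_semester_hurdle = IN_SEMESTER_WEIGHT * 45 / 100  # 45% of 60 = 27 marks
    overall_hurdle = 50.0                               # 50% of the overall 100 marks
    in_semester_ok = in_semester_mark >= in_semester_hurdle
    overall_ok = (in_semester_mark + exam_mark) >= overall_hurdle
    return in_semester_ok and overall_ok


print(meets_mark_hurdles(30.0, 25.0))  # 55 overall and 30 of 60 in-semester
print(meets_mark_hurdles(20.0, 35.0))  # 55 overall but only 20 of 60 in-semester
```

The second call shows why the in-semester hurdle matters: a strong exam cannot compensate for falling below 45% of the in-semester part.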
Assessment - Assignments

 For Assignment 1, you are allowed to work in groups of up to two.
 You have to find group members yourself
 The team members should be from the same tutorial
 You may discuss the assignment specification with other students, and you may ask for help with learning the material covered in the unit, but you must not submit work which has been done by another person
 If you give your work to another student and that student submits that work, then s/he and you are both guilty of Academic Misconduct
Plagiarism…

 http://www.utas.edu.au/student-learning/for-students
 Using words, ideas, computer code, or any work by someone else without giving proper credit is academic dishonesty.
 Academic dishonesty is often referred to as plagiarism.
 While studying at University you are expected to submit work that is your own.
 The intentional copying of someone else’s work as one’s own is a serious offence punishable by penalties that may range from a fine or deduction/cancellation of marks to, in the most serious of cases, exclusion from a unit, a course or the University.
Assessment – Assignments
- Late Penalties

• From the CIS Late Assessment Policy:


<paste>
– Up to 24 hours after the due date. The assignment will be marked in the usual way and the mark recorded will be 80% of the actual mark obtained.
– More than 24 hours and up to 7 days after the due date. The assignment will be marked in the usual way and the mark recorded will be 50% of the actual mark obtained.
– Later than 7 days after the due date. The assignment will not be marked.
</paste>
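The late-penalty bands quoted above can be written as a small function. This is a sketch under stated assumptions: the function name is hypothetical, lateness is measured in hours, and work submitted on time is assumed to keep its full mark.

```python
from typing import Optional


def recorded_mark(actual_mark: float, hours_late: float) -> Optional[float]:
    """Return the recorded mark, or None if the work is too late to be marked.

    Hypothetical helper based on the quoted bands; it is not an official calculator.
    """
    if hours_late <= 0:
        return actual_mark        # on time: assumed to carry no penalty
    if hours_late <= 24:
        return actual_mark * 0.8  # up to 24 hours late: 80% of the actual mark
    if hours_late <= 7 * 24:
        return actual_mark * 0.5  # more than 24 hours and up to 7 days: 50%
    return None                   # later than 7 days: not marked


print(recorded_mark(80.0, 12))   # within the first 24 hours
print(recorded_mark(80.0, 48))   # between 24 hours and 7 days
print(recorded_mark(80.0, 200))  # more than 7 days late
```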
Resources
http://www.utas.edu.au/engineering-ict/current-student-resources
Assessment – Assignments
- Extensions

 If you want / need an extension for an assignment, you must complete the Extension Form available on the CIS Resources webpage http://www.utas.edu.au/engineering-ict/current-student-resources, and provide suitable supporting documentation
 If possible, apply for an extension before the assignment is due
Assessment – Final Exam

 40% of overall mark


 At the end of semester during University Examination Period
 Will cover whole semester's work
Resources – MyLO

• KIT712 Data Management Technology Unit Home Page


– Content
– Lecture Slides
– Online Modules
– Tutorial Sheets
– Assessment
– Information
• Unit Outline
• Academic Integrity
– Announcements
Resources - General

 Virtual Machines on Each Desktop


 School of Computing Resources
 Help Desk
 Consultation Times
 University Resources
 Services & Support
http://www.students.utas.edu.au/
Resources - Fellow Students

 Discuss topics
 Work together on tutorial exercises etc
 Do not believe everything that other students tell you
 Remember that assignment work must be the individual work of the
student who submits it
 Do not email your assignment work to other students
 Do not edit other students’ assignment work
Tips for Success - Overview

Actively participate in the unit
Do the work in the unit as it falls due
If you get into difficulties, seek help as soon as possible
Tips for Success - Tutorials

 Try tutorial problems and study lecture slides before coming to tutorials
 Actively participate in set activities
 Follow up afterwards (complete activities if necessary)
Tips for Success – Private Study

– Keep up to date
– Follow up on problems/ questions from lectures
– Complete tutorial activities
– Read assignment specifications as soon as they are issued
– Seek help as soon as possible
What do we expect from you?

 Regular attendance of lectures:
 Pay full attention, be enthusiastic, be fully committed to learning new things, ask questions during the class, participate in discussions
 Study Lecture Slides before coming to tutorials
 Start on assignments as soon as they are announced
 If you have a problem with the lecturer, the lectures, the unit or anything else, please discuss it with me early.
 Don’t take out your frustrations on me during eValuate
Database Systems
Data vs. Information

Data
 Raw facts
 Raw data - Has not yet been processed to reveal the meaning
 Building blocks of information
 Data management
 Generation, storage, and retrieval of data

Information
 Produced by processing data
 Reveals the meaning of data
 Enables knowledge creation
 Should be accurate, relevant, and timely to enable good decision making
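A small sketch may help make the distinction concrete: raw facts go in, and processing them produces something that supports a decision. The sales records, region names and function name below are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical raw data: individual sales records as (region, amount) pairs.
raw_sales = [
    ("Hobart", 120.0),
    ("Launceston", 80.0),
    ("Hobart", 200.0),
    ("Launceston", 40.0),
]


def total_by_region(records):
    """Process raw facts into information: total sales per region."""
    totals = defaultdict(float)
    for region, amount in records:
        totals[region] += amount
    return dict(totals)


information = total_by_region(raw_sales)
print(information)
```

The individual records are data; the per-region totals are information because they reveal a meaning that the raw list does not show directly.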
What is a Database?

Shared, integrated computer structure that stores a collection of:
 End-user data: Raw facts of interest to the end user
 Metadata: Data about data, through which the end-user data are integrated and managed
 Describes data characteristics and relationships
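The difference between end-user data and metadata can be seen by asking a DBMS to describe one of its own tables. The sketch below uses SQLite through Python's standard sqlite3 module purely for convenience (the unit itself uses Oracle); the table and column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO student (student_id, name) VALUES (1, 'Alice')")

# End-user data: the raw facts stored in the table.
end_user_rows = conn.execute("SELECT student_id, name FROM student").fetchall()

# Metadata: the DBMS's own description of the table's columns.
# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk) per column.
column_metadata = [
    (name, col_type, bool(notnull), bool(pk))
    for _cid, name, col_type, notnull, _default, pk in conn.execute(
        "PRAGMA table_info(student)"
    )
]

print(end_user_rows)
print(column_metadata)
conn.close()
```

The first result holds the facts an end user cares about; the second holds column names, types and key flags, which is data about that data.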
Database management system (DBMS)

 Manages the database structure


 Collection of programs
 Controls access to data stored in the database
[Figure: The DBMS Manages the Interaction between the End User and the Database]
Role of the DBMS

 Intermediary between the user and the database


 Enables data to be shared
 Presents the end user with an integrated view of the data
 Receives and translates application requests into operations required
to fulfill the requests
 Hides database’s internal complexity from the application programs
and users
Advantages of the DBMS

 Better data integration and less data inconsistency


 Data inconsistency: Different versions of the same data appear in different places
 Increased end-user productivity
 Improved:
 Data sharing
 Data security
 Data access
 Decision making
 Data quality: Promoting accuracy, validity, and timeliness of data
Types of Databases: User Count

 Single-user database: Supports one user at a time
 Desktop database: Runs on a PC
 Multiuser database: Supports multiple users at the same time
 Workgroup database: Supports a small number of users or a specific department
 Enterprise database: Supports many users across many departments
Types of Databases: Location

 Centralized database: Data is located at a single site

 Distributed database: Data is distributed across different sites

 Cloud database: Created and maintained using cloud data services that provide
defined performance measures for the database
Types of Databases: Data Subject

 General-purpose databases: Contain a wide variety of data used in multiple disciplines
 Discipline-specific databases: Contain data focused on specific subject areas


Types of Databases: Support

 Operational database: Designed to support a company’s day-to-day operations
 Analytical database: Stores historical data and business metrics used exclusively for tactical or strategic decision making
 Data warehouse: Stores data in a format optimized for decision support
Types of Databases: Types of Data

 Unstructured data: Exists in its original (raw) state
 Structured data: Results from formatting
 Structure is applied based on the type of processing to be performed
 Semistructured data: Has been processed to some extent
 Extensible Markup Language (XML)
 Represents data elements in textual format
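As a rough illustration of semistructured data, the sketch below parses a tiny XML document with Python's standard library. The document content and field names are invented; the point is only that tags describe each element in text and that records need not all carry the same elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document: tags describe each data element in plain text.
xml_text = """
<students>
  <student id="1"><name>Alice</name><unit>KIT712</unit></student>
  <student id="2"><name>Bob</name></student>
</students>
"""

root = ET.fromstring(xml_text)

# Elements may be present or absent per record, which is typical of semistructured data.
students = [
    {
        "id": s.get("id"),
        "name": s.findtext("name"),
        "unit": s.findtext("unit"),  # None when the element is missing
    }
    for s in root.findall("student")
]
print(students)
```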

Database Life Cycle


Six Phases

 Database initial study


 Database design
 Implementation and loading
 Testing and evaluation
 Operation
 Maintenance and evolution
The Database Initial Study

 Overall purpose:
 Analyze company situation
 Define problems and constraints
 Define objectives
 Define scope and boundaries
[Callout: Help in getting Business Rules]
 Interactive and iterative processes required to complete first phase of DBLC successfully
The Database Initial Study (cont’d.)

 Analyze the company situation
 General conditions in which company operates, its organizational structure, and its mission
 Discover what company’s operational components are, how they function, how information flows between them and how they interact
The Database Initial Study (cont’d.)

 Define problems and constraints
 Formal and informal information sources
 Finding precise answers is important
 Accurate problem definition does not always yield a solution
The Database Initial Study (cont’d.)

 Database system objectives must correspond to those envisioned by end users


 What is proposed system’s initial objective?
 Will system interface with other systems in the company?
 Will system share data with other systems or users?
 Scope: extent of design according to operational requirements
 Boundaries: limits external to system
Database Design

 Most critical phase


 Necessary to concentrate on data characteristics required to build database model
 Makes sure final product meets requirements
 Two views of data within system:
 Business view
 Data as information source
 Designer’s view
 Data structure, access, and activities required to transform data into information
Database design process

 Create Conceptual design
 Analysis of business rules
 Entity relationship modeling
 Iterative process with verification
 Create Logical design
 Determine DBMS
 Cost
 DBMS features and tools
 Underlying model
 Portability
 DBMS hardware requirements
 Physical design
Implementation and Loading

 Install DBMS
 Creating a Database
 Load and convert the Data
 Other issues
 Performance
 Security
 Backup and recovery
 Integrity
 Company standards
 Concurrency controls
Testing and Evaluation

 Database is tested and fine-tuned for performance, integrity, concurrent access, and security constraints
 Occurs in parallel with applications programming
 Database tools used to prototype applications
 If implementation fails to meet some of system’s evaluation criteria:
 Fine-tune specific system and DBMS configuration parameters
 Modify physical or logical design
 Upgrade software and/or hardware platform
Testing and Evaluation
(cont’d.)

 Integrity
 Enforced via proper use of primary and foreign key rules
 Backup and Recovery
 Full backup
 Differential backup
 Transaction log backup
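The integrity point above (primary and foreign key rules) can be tried out in a few lines. This is a minimal sketch using SQLite via Python's sqlite3 module rather than Oracle; the table names, column names and the `try_insert` helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " dept_id INTEGER NOT NULL REFERENCES department(dept_id))"
)
conn.execute("INSERT INTO department VALUES (10, 'Sales')")
conn.execute("INSERT INTO employee VALUES (1, 10)")  # valid: department 10 exists


def try_insert(sql: str) -> bool:
    """Return True if the insert succeeds, False if an integrity rule rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_key_ok = try_insert("INSERT INTO employee VALUES (1, 10)")  # repeats primary key 1
orphan_row_ok = try_insert("INSERT INTO employee VALUES (2, 99)")     # department 99 does not exist
print(duplicate_key_ok, orphan_row_ok)
```

Both inserts are rejected by the DBMS itself, so the rules hold no matter which application sends the statement.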
Operation

 Once database has passed evaluation stage, it is considered operational


 Beginning of operational phase starts process of system evolution
 Problems not foreseen during testing surface
 Solutions may include:
 Load-balancing software to distribute transactions among multiple computers
 Increasing available cache
Maintenance and Evolution

 Required periodic maintenance:


 Preventive maintenance (backup)
 Corrective maintenance (recovery)
 Adaptive maintenance
 Assignment of access permissions and their maintenance for new and old users
 Generation of database access statistics
 Periodic security audits
 Periodic system-usage summaries
Summary: Creating a Database

 Describes WHAT the system contains
 Describes HOW the system will be implemented
 Describes HOW the system will be implemented using a specific DBMS
Designing Business Rules
Business Rules

 Are collected in the initial phase of the database life cycle.
 Key points for writing business rules:
 Discover what the company’s operational components are, how they function, how information flows between them and how they interact
 Use formal and informal information sources
 Describe characteristics of data as viewed by the company

Source: Database Systems, authors: Peter Rob and Carlos Coronel


Examples

 A customer can make many payments into an account
 Each payment should be a multiple of 100
 The working hours of the organization are 8am to 5pm
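Business rules like the three examples above eventually become checks that a system can enforce. The following is a hedged sketch of how they might be expressed; the function names, the customer IDs and the decision to reject zero or negative amounts are assumptions, not part of the rules as stated.

```python
from datetime import time

# Business rule: working hours of the organization are 8am to 5pm.
OPENING_TIME = time(8, 0)
CLOSING_TIME = time(17, 0)


def is_valid_payment_amount(amount: int) -> bool:
    """Business rule: each payment should be a multiple of 100.

    Rejecting zero and negative amounts is an extra assumption.
    """
    return amount > 0 and amount % 100 == 0


def is_within_working_hours(moment: time) -> bool:
    """Business rule: the organization works between 8am and 5pm."""
    return OPENING_TIME <= moment <= CLOSING_TIME


# Business rule: a customer can make many payments, so each customer maps to a list.
payments_by_customer = {"C001": [100, 300], "C002": [500]}

all_valid = all(
    is_valid_payment_amount(amount)
    for amounts in payments_by_customer.values()
    for amount in amounts
)
print(all_valid)
print(is_valid_payment_amount(250))
print(is_within_working_hours(time(18, 30)))
```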
Why do we need Business Rules?

 Standardize company’s view of data


 Communications tool between users and designers
 Allow designer to understand the nature, role, and scope of data
 Allow designer to understand business processes
 Allow designer to develop appropriate relationship participation rules and
constraints

Source: Database Systems, authors: Peter Rob and Carlos Coronel


Activity: Write some Business Rules for the MyLO Website
Data Modelling
Conceptual Model
(Entity-Relationship (E-R) Models)

 Graphical representation of entities and their relationships in a database structure
 Widely accepted and adapted graphical tool for data modeling
 Introduced by Chen in 1976
 Many extensions/variations exist
 Basis for most other modeling approaches
ER Model - Basic Building Blocks

 Entity - anything about which data are to be collected and stored


 Attribute - a characteristic of an entity
 Relationship - describes an association among entities
 One-to-many (1:M) relationship
 Many-to-many (M:N or M:M) relationship
 One-to-one (1:1) relationship

 Participation - a restriction placed on the data


Synonyms you should know…

Entity = class = relation = table
Attribute = column
Instance = row

[Figure: a table with its columns and rows labelled]
Many Conventions

We will use ONLY the CONVENTIONS given in the tutorial handout

Source: Database Systems, authors: Peter Rob and Carlos Coronel


Translating Business Rules into ERD Components

 Nouns translate into entities


 Verbs translate into relationships among entities
 Relationships are bidirectional
 Questions to identify the relationship type
 How many instances of B are related to one instance of A?
 How many instances of A are related to one instance of B?
 Examples
 How many classes can one student enroll in? Many
 How many students can be enrolled in one class? Many
 Relationship between Student and Class is: Many to Many, *-*
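The two questions above can be asked of sample data as a sanity check. The sketch below classifies a relationship from observed instance pairs; the function name and sample values are assumptions, and sample data can only suggest a cardinality, since the real answer comes from the business rules.

```python
from collections import defaultdict


def relationship_type(pairs):
    """Classify a relationship from observed (a, b) instance pairs.

    Answers both questions: how many B per A, and how many A per B.
    """
    b_per_a = defaultdict(set)
    a_per_b = defaultdict(set)
    for a, b in pairs:
        b_per_a[a].add(b)
        a_per_b[b].add(a)
    many_b_for_one_a = any(len(bs) > 1 for bs in b_per_a.values())
    many_a_for_one_b = any(len(as_) > 1 for as_ in a_per_b.values())
    if many_b_for_one_a and many_a_for_one_b:
        return "M:N"
    if many_b_for_one_a or many_a_for_one_b:
        return "1:M"
    return "1:1"


# Hypothetical enrolments: one student in many classes, one class with many students.
enrolments = [("s1", "KIT712"), ("s1", "KIT501"), ("s2", "KIT712")]
print(relationship_type(enrolments))
```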

Naming Conventions

 Entity names - Required to:


 Be descriptive of the objects in the business environment
 Use terminology that is familiar to the users
 Attribute name - Required to be descriptive of the data represented by the attribute
 Proper naming:
 Facilitates communication between parties
 Promotes self-documentation
CAR RENTAL ENTERPRISE (EZ-Rent)

[ER diagram labels: CUSTOMER, c-makes-r, cu-rents-ca, r-is for-c, CAR, rental]

Notes on the diagram:
 customers are important to the enterprise
 we care about customers (people) who rent cars (not other people)
 cars are important to the enterprise
 we care about cars that are rented by customers (not other cars)
 a rental is for exactly one customer
 a given car may be used for no rentals or for many rentals
 what should happen to a car that’s never rented?

The characteristics of customers that are important:
 customer name
 customer address
 customer since
 customer telephone number

The characteristics of rental that are important:
 customer who rented (customer who rented must be known to EZ-Rent)
 which car was rented (car rented must belong to EZ-Rent)
 pick-up location
 date out
 time out
 mileage out
 return location
 date in
 time in
 mileage in
 date/time in must be later than date/time out

Car characteristics listed on the slide:
 car make (EZ-Rent only rents Ford and GM cars)
 car model
 car model year
 manufacturer
 vehicle identification no. (VIN)
 number of doors (no car has more than 5 doors)
 colour
 date purchased
 licence number
 licence state (must be one of the states EZ-Rent operates in)
 miles driven
 car style (must be one of: sedan, coupe, wagon, minivan, sport utility, truck, convertible)
ER Conventions for KIT712
 DIAGRAMMING CONVENTIONS
 NAMING CONVENTIONS
ER Modeling in KIT712

 In this unit, students will be asked to draw simple (conceptual) ER diagrams to model given scenarios
 Students should use the simple version of the Crows Feet conventions explained in the following lecture slides and tutorials
 Students should not use the conventions from the Rob and Coronel book (or from any other source)
Diagramming Conventions

 An Entity is represented by a rectangle


 A Relationship is represented by a line joining two entities
 Attributes are written in a list next to the entity or relationship to
which they belong
 Identifiers are placed at the top of the list of attributes and are
underlined
Diagramming Conventions - Entity

 An Entity is represented by a rectangle
 The name of an entity should be a noun or a noun phrase
 The name of an entity is written in UPPERCASE
 The name of an entity is written inside the rectangle representing the entity
Diagramming Conventions - Attributes

 Attributes are written in a list next to the entity or the relationship to which they belong
 The name of an attribute is usually a noun or a noun phrase – sometimes the name is an adjective
 An attribute is written in Lowercase With Initial Capitals
 Identifiers are placed at the top of the list of attributes and are Underlined
Example of an Entity

Modelling units offered at the University of Woolloomooloo

Note: This initial model is not necessarily perfect


Diagramming Conventions - Relationship

 A Relationship is represented by a line joining two entities
 All relationships are binary (or recursive) - no n-ary (e.g. ternary) relationships
 The name of a relationship should be a verb or a verb phrase
 The name of a relationship is written in lowercase – preferably also in italics
 The name of a relationship is near the line representing the relationship
Example of a Relationship

Modelling the association between students and the units that they are enrolled in

Note: Because we are thinking about the relationship at this stage, I have not listed the attributes for the entities
Another Example of a Relationship

Modelling the association between a hospital ward and the patients admitted to the ward

Note: The attribute Date-admitted belongs to the relationship
Diagramming Conventions – Relationship Cardinality

 The cardinality of a relationship indicates the number of possible occurrences of an entity participating in a given relationship
 We will add crows feet to relationship lines to indicate cardinality
Diagramming Conventions – Crows Feet for Cardinality

 Crows feet indicate that many (zero or more) instances of the entity adjacent to the crows feet may be associated with each instance of the entity at the other end of the relationship line
 An absence of crows feet indicates that zero or one instances of the entity adjacent to the absence of crows feet may be associated with each instance of the entity at the other end of the relationship line
Example of Relationship Cardinality – one-to-many

 Each patient is receiving, at most, one type of treatment


 Each type of treatment may be received by many patients

Note: This initial model is not necessarily perfect


Example of Relationship Cardinality – many-to-many

 Each patient is receiving many types of treatments


 Each type of treatment is received by many patients
Diagramming Conventions – Relationship Participation

 Participation indicates whether all, or only some, entity occurrences participate in a relationship
 We use the | symbol on relationship lines to indicate mandatory participation
 We use the O symbol on relationship lines to indicate optional participation
Another Example of Relationship Participation

If we wish to indicate that it is mandatory for a student to be enrolled in at least one unit, we add a stroke ( | ) at the end of the relationship line near the entity UNIT
Example of Relationship Participation

If we wish to indicate that it is not necessary for a standardised treatment to be received by any patients, we add an O (for Optional) at the end of the line near the entity PATIENT
Updated Example of Relationship Participation

If we also wish to indicate that it is not necessary for a patient to be receiving any standardised treatments, we add an O (for Optional) at the end of the line near the treatment entity
When to Indicate Participation

 We will add O or | to our relationship lines only if the scenario specifically indicates that participation is optional or mandatory
 Some people choose to mark every end of every relationship line with either | or O
 Some people choose to use only | (mandatory) symbols
ER Diagram – Example 1

Finnegan’s Falderals Factory - Projects


Notes on ER Diagram – Example 1

 ACTIVITY is a subordinate entity (also called a weak entity)
 Each instance of ACTIVITY must be associated with an instance of PROJECT
 Note the mandatory symbol (|) near PROJECT
 The identifier of ACTIVITY is derived from the identifier of PROJECT
 The identifier of ACTIVITY is {Project-id, Activity-no}
 The identifier of PROJECT is {Project-id}
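One way to see what the derived identifier means in practice is to map the two entities to tables. The sketch below is an assumption-laden illustration (SQLite via Python's sqlite3, invented column types and sample values), not the unit's prescribed mapping: ACTIVITY's key is the pair {Project-id, Activity-no}, and its Project-id must refer to an existing PROJECT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled
conn.execute("CREATE TABLE project (project_id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE activity ("
    " project_id TEXT NOT NULL REFERENCES project(project_id),"
    " activity_no INTEGER NOT NULL,"
    " PRIMARY KEY (project_id, activity_no))"  # identifier borrows the owner's identifier
)
conn.execute("INSERT INTO project VALUES ('P1')")
conn.execute("INSERT INTO project VALUES ('P2')")

# The same activity number can be reused under different projects.
conn.execute("INSERT INTO activity VALUES ('P1', 1)")
conn.execute("INSERT INTO activity VALUES ('P2', 1)")

try:
    conn.execute("INSERT INTO activity VALUES ('P3', 1)")  # project P3 does not exist
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

activity_count = conn.execute("SELECT COUNT(*) FROM activity").fetchone()[0]
print(orphan_rejected, activity_count)
```

Activity number 1 exists twice without conflict because the project identifier is part of the key, while an activity with no owning project is refused.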
ER Diagram – Example 2

Fred Friendly’s Factory - Projects


Notes on ER Diagram – Example 2

 must precede is a recursive relationship (also called a unary relationship)
 Each instance of TASK may be associated with many instances of TASK
 is part of is a binary relationship
 The two entities in the relationship are TASK and PROJECT
 Each instance of PROJECT may be associated with many instances of TASK
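A recursive relationship can be pictured as a table that refers to the same entity at both ends. The sketch below assumes, for illustration only, that must precede is many-to-many and stores it in its own table; the task names, column names and the choice of SQLite are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task (task_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
# "must precede" is recursive: both ends of the relationship refer to TASK.
conn.execute(
    "CREATE TABLE must_precede ("
    " earlier_task INTEGER NOT NULL REFERENCES task(task_id),"
    " later_task INTEGER NOT NULL REFERENCES task(task_id),"
    " PRIMARY KEY (earlier_task, later_task))"
)
conn.executemany(
    "INSERT INTO task VALUES (?, ?)", [(1, "Design"), (2, "Build"), (3, "Test")]
)
conn.executemany(
    "INSERT INTO must_precede VALUES (?, ?)", [(1, 2), (1, 3), (2, 3)]
)

# Self-join: TASK appears twice, once in each role of the relationship.
rows = conn.execute(
    "SELECT e.name, l.name FROM must_precede mp"
    " JOIN task e ON e.task_id = mp.earlier_task"
    " JOIN task l ON l.task_id = mp.later_task"
    " ORDER BY e.task_id, l.task_id"
).fetchall()
print(rows)
```

The query joins the task table to itself, which is the typical signature of a unary relationship once it reaches SQL.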
Our ER Naming Conventions for (Conceptual) ER Diagrams

 The name of an entity is written in UPPERCASE (also known as ALL CAPITALS)
 The name of a relationship is written in lowercase – preferably also in italics
 The name of an attribute is written in Lowercase With Initial Capitals
 Note that each type of name is written differently
Creating an ER Diagram

 Use the ER Conventions when drawing ER diagrams from scenarios
 Give each ER Diagram a title, e.g.
 Finnegan’s Falderals Factory - Projects
 University of Woolloomooloo
 Unless explicitly told otherwise, include attributes on the diagram
ER Conventions for KIT712 - Handout

 Contains more information and more examples


 Available on MyLO
Example Scenario
Canterbury Cat Club has decided to create a database to store information about
the cats that belong to its members.
Each cat is allocated a unique identifier. The Club also stores the following data
about each cat: name, sex, age, and spayed status.
Each cat may have one or more owners, and each owner may own one or more
cats. One owner is identified as the primary contact for each cat.
Each cat owner must be a member of the Club, therefore each owner has a unique
Member Number. The Club also stores the following data about each owner:
first name, surname, phone number, and address.
Each owner has only one address, but some owners share the same address.
The Club stores the following data about each address: street number, street name,
suburb, state, and postcode.
ER Diagram of Given Scenario
(Conceptual Model)

Canterbury Cat Club
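For readers who want to see where the scenario might lead after ER modelling, here is one possible relational mapping, offered as a sketch rather than the lecture's model. The surrogate `address_id`, the `ownership` table name, the `is_primary_contact` flag, the column types and all sample rows are assumptions; SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE address (
    address_id    INTEGER PRIMARY KEY,  -- surrogate key: an assumption, not in the scenario
    street_number TEXT, street_name TEXT, suburb TEXT, state TEXT, postcode TEXT
);
CREATE TABLE owner (
    member_number INTEGER PRIMARY KEY,  -- unique Member Number from the scenario
    first_name TEXT, surname TEXT, phone_number TEXT,
    address_id INTEGER NOT NULL REFERENCES address(address_id)  -- one address per owner
);
CREATE TABLE cat (
    cat_id INTEGER PRIMARY KEY,         -- unique identifier allocated to each cat
    name TEXT, sex TEXT, age INTEGER, spayed INTEGER
);
CREATE TABLE ownership (                -- resolves the many-to-many between cat and owner
    cat_id INTEGER NOT NULL REFERENCES cat(cat_id),
    member_number INTEGER NOT NULL REFERENCES owner(member_number),
    is_primary_contact INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (cat_id, member_number)
);
""")

# Hypothetical sample data: two owners share one address and co-own one cat.
conn.execute("INSERT INTO address VALUES (1, '12', 'High St', 'Sandy Bay', 'TAS', '7005')")
conn.executemany(
    "INSERT INTO owner VALUES (?, ?, ?, ?, ?)",
    [(100, "Ann", "Lee", "0400 000 001", 1), (101, "Ben", "Lee", "0400 000 002", 1)],
)
conn.execute("INSERT INTO cat VALUES (1, 'Tom', 'M', 3, 1)")
conn.executemany("INSERT INTO ownership VALUES (?, ?, ?)", [(1, 100, 1), (1, 101, 0)])

owners_of_cat_1 = conn.execute(
    "SELECT o.first_name, s.is_primary_contact FROM ownership s"
    " JOIN owner o ON o.member_number = s.member_number"
    " WHERE s.cat_id = 1 ORDER BY o.member_number"
).fetchall()
print(owners_of_cat_1)
```

This sketch does not enforce the rule that exactly one owner is the primary contact for each cat; that would need an additional constraint or a different design.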


Bad Business Rules

 Discussing implementation details such as foreign keys and primary keys
 Discussing what is an entity or a relationship
 Not giving details as required
Assignment 1 (more information announced this week)
 Consider a scenario [details will be in the assignment description] where you have to design a database for an organisation
 Part 1
 Business Rules
 Improve based on comments given by the Lecturer on your submission
 Part 2
 ER Modelling based on updated business rules
 Relational Model
 Submission via MyLO
 If done in a group of two, names/IDs should be provided during submission
