0% found this document useful (0 votes)
2 views

06_database_design_part_II

The document provides an overview of SQL, its history, usage, and various types of SQL statements. It explains the structure and elements of SQL queries, including commands for data manipulation, definition, and control. Additionally, it introduces NoSQL databases and discusses the differences between relational and non-relational database systems.

Uploaded by

gegis.jkuat2021
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
2 views

06_database_design_part_II

The document provides an overview of SQL, its history, usage, and various types of SQL statements. It explains the structure and elements of SQL queries, including commands for data manipulation, definition, and control. Additionally, it introduces NoSQL databases and discusses the differences between relational and non-relational database systems.

Uploaded by

gegis.jkuat2021
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 58

CONCEPTS IN DATABASE SYSTEMS - II

Monday, November 4, 2024 Felix Mutua, Ph.D.


Outline for Today’s Lecture
2

 SQL
 Spatial databases
 Implementation
What is SQL?
3

 SQL is the standard language for dealing with Relational Databases.


SQL can be used to insert, search, update, and delete database
records.
 SQL can do lots of other operations, including optimizing and
maintenance of databases.
 SQL stands for Structured Query language, pronounced as "S-Q-L" or
sometimes as "See-Quel"... Relational databases like MySQL, Postgres
Database, Oracle, MS SQL Server, Sybase, etc. use ANSI SQL.

Felix Mutua, PhD


How to Use SQL
4

 SQL Code Example:

SELECT * FROM Students WHERE Age > 21

SQL syntaxes used in different databases are almost similar, though few RDBMS use a few different commands and
even proprietary SQL syntaxes.

Felix Mutua, PhD


What is SQL used for?
5

 Here are important reasons for using SQL


 It helps users to access data in the RDBMS system.
 It helps you to describe the data.

 It allows you to define the data in a database and manipulate that specific
data.
 With the help of SQL, you can create and drop databases and tables.

 SQL offers you to use the function in a database, create a view, and stored
procedure.
 You can set permissions on tables, procedures, and views.

Felix Mutua, PhD


A Brief History of SQL
6

 1970 - Dr. Edgar F. "Ted" Codd described a relational model for


databases.
 1974 - Structured Query Language appeared.
 1978 - IBM released a product called System/R.
 1986 - IBM developed the prototype of a relational database, which is
standardized by ANSI.
 1989 - First ever version launched of SQL
 1999 - SQL 3 launched with features like triggers, object-orientation, etc.
 SQL 2003 - window functions, XML-related features, etc.
 SQL 2006 - Support for XML Query Language
 SQL 2011 -improved support for temporal databases
Felix Mutua, PhD
Types of SQL Statements
7

 Five types of widely used SQL queries.


 Data Definition Language (DDL)
 Data Manipulation Language (DML)

 Data Control Language (DCL)

 Transaction Control Language (TCL)

 Data Query Language (DQL)

Felix Mutua, PhD


Types of SQL Statements
8

 Data Definition Language (DDL)


 Create/alter/delete tables and their attributes
 Following lectures...
 Data Manipulation Language (DML)
 Query one or more tables – discussed next !
 Insert/delete/modify tuples in tables
 Transact-SQL
 TCL commands deal with the transaction within the database. E.g coomit, Rollback, etc
 Data Control Language (DCL)
 commands such as GRANT and REVOKE which mainly deal with the rights, permissions
and other controls of the database system.
 Data Query Language (DQL)
 used for performing queries on the data within schema objects. E.g SELECT
List of SQL Commands
9

 Some of the most commonly used SQL commands:


 CREATE - defines the database structure schema
 INSERT - inserts data into the row of a table

 UPDATE - updates data in a database

 DELETE - removes one or more rows from a table

 SELECT - selects the attribute based on the condition described by the


WHERE clause
 DROP - removes tables and databases

Felix Mutua, PhD


SQL Language elements
10

 Keywords: Each SQL statement contains single or multiple keywords.


 Identifiers: Identifiers are names of objects in the database, like user IDs, tables, and columns.
 Strings: Strings can be either literal strings or expressions with VARCHAR or CHAR data types.
 Expressions: Expressions are formed from several elements, like constants, SQL operators, column
names, and subqueries.
 Search Conditions: Conditions are used to select a subset of the rows from a table or used to control
statements like an IF statement to determine control of flow.
 Special Values: Special values should be used in expressions and as column defaults when building
tables.
 Variables: Sybase IQ supports local variables, global variables, and connection-level variables.
 Comments: Comment is another SQL element which is used to attach explanatory text to SQL
statements or blocks of statements. The database server does not execute any comment.
 NULL Value: Use NULL, which helps you to specify a value that is unknown, missing, or not applicable.

Felix Mutua, PhD


What is NoSQL?
11

 NoSQL is an upcoming category of Database Management Systems. Its


main characteristic is its non-adherence to Relational Database Concepts.
NoSQL means "Not only SQL".
 The concept of NoSQL databases grew with internet giants such as Google,
Facebook, Amazon etc. who deal with gigantic volumes of data.
 NoSQL database is non-relational databases that scale-out better than
relational databases and are designed with web applications in mind. They
do not use SQL to query the data and do not follow strict schemas like
relational models. With NoSQL, ACID (Atomicity, Consistency, Isolation,
Durability) features are not guaranteed always.
Felix Mutua, PhD
12 Using SQL

Felix Mutua, PhD


Data Types in SQL
13

 Characters:
 CHAR(20) -- fixed length
 VARCHAR(40) -- variable length
 Numbers:
 BIGINT, INT, SMALLINT, TINYINT
 REAL, FLOAT -- differ in precision
 MONEY
 Times and dates:
 DATE
 DATETIME -- SQL Server
 Others... All are simple
Tables in SQL
14

Product Attribute names


Table name
PName Price Category Manufacturer

Gizmo $19.99 Gadgets GizmoWorks

Powergizmo $29.99 Gadgets GizmoWorks

SingleTouch $149.99 Photography Canon

MultiTouch $203.99 Household Hitachi

Tuples or rows
Tables Explained
15

 A tuple = a record
 Restriction: all attributes are of atomic type
 A table = a set of tuples
 Like a list…
 …but it is unordered: no first(), no next(), no last().
Tables Explained
16

 The schema of a table is the table name and its attributes:


Product(PName, Price, Category, Manfacturer)

 A key is an attribute whose values are unique;


we underline a key

Product(PName, Price, Category, Manfacturer)


SQL Query
17

Basic form: (plus many many more bells and whistles)

SELECT attributes
FROM relations (possibly multiple)
WHERE conditions (selections)
Simple SQL Query
18

Product PName Price Category Manufacturer


Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi

SELECT *
FROM Product
WHERE category=‘Gadgets’

PName Price Category Manufacturer


Gizmo $19.99 Gadgets GizmoWorks

“selection” Powergizmo $29.99 Gadgets GizmoWorks


Simple SQL Query
19

Product PName Price Category Manufacturer


Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi

SELECT PName, Price, Manufacturer


FROM Product
WHERE Price > 100

PName Price Manufacturer


“selection” and SingleTouch $149.99 Canon
“projection” MultiTouch $203.99 Hitachi
A Notation for SQL Queries
20

Input Schema

Product(PName, Price, Category, Manfacturer)

SELECT PName, Price, Manufacturer


FROM Product
WHERE Price > 100

Answer(PName, Price, Manfacturer)

Output Schema
Selections
21

What goes in the WHERE clause:


 x = y, x < y, x <= y, etc

 For number, they have the usual meanings


 For CHAR and VARCHAR: lexicographic ordering
 Expected conversion between CHAR and VARCHAR
 For dates and times, what you expect...
 Pattern matching on strings...
The LIKE operator
22

 s LIKE p: pattern matching on strings


 p may contain two special symbols:
 % = any sequence of characters
 _ = any single character

Product(PName, Price, Category, Manufacturer)


Find all products whose name mentions ‘gizmo’:

SELECT *
FROM Products
WHERE PName LIKE ‘%gizmo%’
Eliminating Duplicates
23

Category
SELECT DISTINCT category Gadgets
FROM Product
Photography
Household

Compare to:

Category
Gadgets
SELECT category
Gadgets
FROM Product
Photography
Household
Ordering the Results
24

SELECT pname, price, manufacturer


FROM Product
WHERE category=‘gizmo’ AND price > 50
ORDER BY price, pname

Ordering is ascending, unless you specify the DESC keyword.

Ties are broken by the second attribute on the ORDER BY list, etc.
Ordering the Results
25

SELECT category
FROM Product
ORDER BY pname

PName Price Category Manufacturer

?
Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi
Ordering the Results
26

Category
SELECT DISTINCT category Gadgets
FROM Product
Household
ORDER BY category
Photography

Compare to:

?
SELECT DISTINCT category
FROM Product
ORDER BY pname
Joins in SQL
27

 Connect two or more tables:

Product PName Price Category Manufacturer


Gizmo $19.99 Gadgets GizmoWorks
Powergizmo $29.99 Gadgets GizmoWorks
SingleTouch $149.99 Photography Canon
MultiTouch $203.99 Household Hitachi

Company Cname StockPrice Country

What is GizmoWorks 25 USA


the connection Canon 65 Japan
between
them ? Hitachi 15 Japan
Joins
28

Product (pname, price, category, manufacturer)


Company (cname, stockPrice, country)

Find all products under $200 manufactured in Japan;


return their names and prices.
Join
between Product
and Company

SELECT pname, price


FROM Product, Company
WHERE manufacturer=cname AND country=‘Japan’
AND price <= 200
Joins in SQL
29

Product Company

PName Price Category Manufacturer Cname StockPrice Country


Gizmo $19.99 Gadgets GizmoWorks GizmoWorks 25 USA
Powergizmo $29.99 Gadgets GizmoWorks Canon 65 Japan
SingleTouch $149.99 Photography Canon Hitachi 15 Japan
MultiTouch $203.99 Household Hitachi

SELECT pname, price


FROM Product, Company
WHERE manufacturer=cname AND country=‘Japan’
AND price <= 200

PName Price
SingleTouch $149.99
Joins
30

Product (pname, price, category, manufacturer)


Company (cname, stockPrice, country)

Find all countries that manufacture some product in the ‘Gadgets’ category.

SELECT country
FROM Product, Company
WHERE manufacturer=cname AND category=‘Gadgets’
Joins in SQL
31

Product Company

Name Price Category Manufacturer Cname StockPrice Country


Gizmo $19.99 Gadgets GizmoWorks GizmoWorks 25 USA
Powergizmo $29.99 Gadgets GizmoWorks Canon 65 Japan
SingleTouch $149.99 Photography Canon Hitachi 15 Japan
MultiTouch $203.99 Household Hitachi

SELECT country
FROM Product, Company
WHERE manufacturer=cname AND category=‘Gadgets’

Country
What is ??
the problem ? ??
What’s the
solution ?
Joins
32

Product (pname, price, category, manufacturer)


Purchase (buyer, seller, store, product)
Person(persname, phoneNumber, city)

Find names of people living in Seattle that bought some product in the
‘Gadgets’ category, and the names of the stores they bought such product
from

SELECT DISTINCT persname, store


FROM Person, Purchase, Product
WHERE persname=buyer AND product = pname AND
city=‘Seattle’ AND category=‘Gadgets’
When are two tables related?
33

 Foreign keys are a method for schema designers to tell you so


 A foreign key states that a column is a reference to the key of another table
ex: Product.manufacturer is foreign key of Company
 Gives information and enforces constraint
Disambiguating Attributes
34

 Sometimes two relations have the same attr:


Person(pname, address, worksfor)
Company(cname, address)
Which
SELECT DISTINCT pname, address address ?
FROM Person, Company
WHERE worksfor = cname

SELECT DISTINCT Person.pname, Company.address


FROM Person, Company
WHERE Person.worksfor = Company.cname
Tuple Variables
35

Product (pname, price, category, manufacturer)


Purchase (buyer, seller, store, product)
Person(persname, phoneNumber, city)

Find all stores that sold at least one product that the store
‘BestBuy’ also sold:

SELECT DISTINCT x.store


FROM Purchase AS x, Purchase AS y
WHERE x.product = y.product AND y.store = ‘BestBuy’

Answer (store)
Tuple Variables
36

General rule:
tuple variables introduced automatically by the system:

Product (name, price, category, manufacturer) SELECT name


FROM Product
WHERE price > 100

Becomes: SELECT Product.name


FROM Product AS Product
WHERE Product.price > 100

Doesn’t work when Product occurs more than once:


In that case the user needs to define variables explicitly.
37 RDBMS software

Felix Mutua, PhD


Sampled List of clients
38

Commercial Open Source (Free)

 Oracle  MySQL (partially free)


 Microsoft SQL Server  PostgreSQL (free software)
 PostgreSQL (free software)  SQLite (free software)
 IBM Db2  MariaDB (free software)
 Microsoft Access
 Hive
 Teradata
 Microsoft Azure SQL Database

Felix Mutua, PhD


MySQL
39

 MySQL is an open-source relational database management system (RDBMS).


 Its name is a combination of "My", the name of co-founder Michael Widenius's
daughter, and "SQL", the abbreviation for Structured Query Language.
 MySQL is free and open-source software under the terms of the GNU General
Public License, and is also available under a variety of proprietary licenses.
 MySQL has stand-alone clients that allow users to interact directly with a MySQL
database using SQL, but more often, MySQL is used with other programs to
implement applications that need relational database capability.
 MySQL is used by many database-driven web applications, including Drupal,
Joomla, phpBB, and WordPress.

Felix Mutua, PhD


PostgreSQL
40

 PostgreSQL is a powerful, open source object-relational database system


with over 30 years of active development that has earned it a strong
reputation for reliability, feature robustness, and performance.
 PostgreSQL comes with many features aimed to help developers build
applications, administrators to protect data integrity and build fault-
tolerant environments, and help you manage your data no matter how big
or small the dataset.
 In addition to being free and open source, PostgreSQL is highly extensible.
For example, you can define your own data types, build out custom
functions, even write code from different programming languages without
recompiling your database!
Felix Mutua, PhD
Features found in PostgreSQL
41

 Data Types
 Primitives: Integer, Numeric, String, Boolean
 Structured: Date/Time, Array, Range, UUID
 Document: JSON/JSONB, XML, Key-value (Hstore)
 Geometry: Point, Line, Circle, Polygon
 Customizations: Composite, Custom Types
 Data Integrity
 UNIQUE, NOT NULL
 Primary Keys
 Foreign Keys
 Exclusion Constraints
 Explicit Locks, Advisory Locks

Felix Mutua, PhD


Features found in PostgreSQL
42

 Extensibility
 Stored functions and procedures
 Procedural Languages: PL/PGSQL, Perl, Python (and many more)

 SQL/JSON path expressions

 Foreign data wrappers: connect to other databases or streams with a


standard SQL interface
 Customizable storage interface for tables

 Many extensions that provide additional functionality, including PostGIS

Felix Mutua, PhD


PostGIS
43

 PostGIS is a spatial database extender for PostgreSQL object-


relational database. It adds support for geographic objects allowing
location queries to be run in SQL.
 In addition to basic location awareness, PostGIS offers many features
rarely found in other competing spatial databases such as Oracle
Locator/Spatial and SQL Server.

Felix Mutua, PhD


PostGIS Features
44

 Processing and analytic functions for both vector and raster data for
splicing, dicing, morphing, reclassifying, and collecting/unioning with the
power of SQL
 raster map algebra for fine-grained raster processing
 Spatial reprojection SQL callable functions for both vector and raster data
 Support for importing / exporting ESRI shapefile vector data via both
commandline and GUI packaged tools and support for more formats via
other 3rd-party Open Source tools
 Packaged command-line for importing raster data from many standard
formats: GeoTiff, NetCDF, PNG, JPG to name a few

Felix Mutua, PhD


PostGIS Features contd..
45

 Rendering and importing vector data support functions for standard textual
formats such as KML,GML, GeoJSON,GeoHash and WKT using SQL
 Rendering raster data in various standard formats GeoTIFF, PNG, JPG,
NetCDF, to name a few using SQL
 Seamless raster/vector SQL callable functions for extrusion of pixel values
by geometric region, running stats by region, clipping rasters by a
geometry, and vectorizing rasters
 3D object support, spatial index, and functions
 Network Topology support
 Packaged Tiger Loader / Geocoder/ Reverse Geocoder / utilizing US
Census Tiger data

Felix Mutua, PhD


46 Introduction to Spatial Databases

Felix Mutua, PhD


Definition
47

 A spatial database is a database that is optimized for storing and


querying data that represents objects defined in a geometric space.
 Most spatial databases allow the representation of simple geometric
objects such as points, lines and polygons.
 Some spatial databases handle more complex structures such as 3D objects,
topological coverages, linear networks, and TINs (triangulated irregular
network).
 While typical databases have developed to manage various numeric and
character types of data, such databases require additional functionality to
process spatial data types efficiently, and developers have often added
geometry or feature data types.
Felix Mutua, PhD
Characteristics of Spatial Databases
48

 Database systems use indexes to quickly look up values; however, this


way of indexing data is not optimal for spatial queries.
 Instead, spatial databases use a spatial index to speed up database
operations.
 In addition to typical SQL queries such as SELECT statements, spatial
databases can perform a wide variety of spatial operations

Felix Mutua, PhD


Characteristics of Spatial Databases
49

 The following operations and many more are specified by the Open Geospatial
Consortium standard:
 Spatial Measurements: Computes line length, polygon area, the distance between
geometries, etc.
 Spatial Functions: Modify existing features to create new ones, for example by providing
a buffer around them, intersecting features, etc.
 Spatial Predicates: Allows true/false queries about spatial relationships between
geometries. Examples include "do two polygons overlap" or 'is there a residence located
within a mile of the area we are planning to build the landfill?' (see DE-9IM)
 Geometry Constructors: Creates new geometries, usually by specifying the vertices (points
or nodes) which define the shape.
 Observer Functions: Queries which return specific information about a feature such as the
location of the center of a circle
Felix Mutua, PhD
Spatial Database Management System
 Spatial Database Management System (SDBMS) provides the
capabilities of a traditional database management system (DBMS)
while allowing special storage and handling of spatial data.
 SDBMS:
 Works with an underlying DBMS
 Allows spatial data models and types

 Supports querying language specific to spatial data types

 Provides handling of spatial data and operations


SDBMS Three-layer Structure

 SDBMS works with a spatial


application at the front end and
Core Spatial
a DBMS at the back end Functionality

Interface to spatial application


 SDBMS has three layers: Taxonomy

Spatial application

Interface to DBMS
Data types

 Interface to spatial application

DBMS
Operations
Query language
 Core spatial functionality Algorithms
Access methods
 Interface to DBMS
Spatial Query Language
 Number of specialized adaptations of SQL
 Spatialquery language
 Temporal query language (TSQL2)

 Object query language (OQL)

 Object oriented structured query language (O2SQL)

 Spatial query language provides tools and structures specifically for


working with spatial data
 SQL3 provides 2D geospatial types and functions
Spatial Query Language Operations
 Three types of queries:
 Basicoperations on all data types (e.g. IsEmpty, Envelope, Boundary)
 Topological/set operators (e.g. Disjoint, Touch, Contains)

 Spatial analysis (e.g. Distance, Intersection, SymmDiff)


Spatial Data Entity Creation
 Form an entity to hold county names, states, populations, and geographies
CREATE TABLE County(
Name varchar(30),
State varchar(30),
Pop Integer,
Shape Polygon);
 Form an entity to hold river names, sources, lengths, and geographies
CREATE TABLE River(
Name varchar(30),
Source varchar(30),
Distance Integer,
Shape LineString);
Example Spatial Query
 Find all the counties that border on Contra Costa county
SELECT C1.Name
FROM County C1, County C2
WHERE Touch(C1.Shape, C2.Shape) = 1 AND C2.Name = ‘Contra Costa’;

 Find all the counties through which the Merced river runs
SELECT C.Name, R.Name
FROM County C, River R
WHERE Intersect(C.Shape, R.Shape) = 1 AND R.Name = ‘Merced’;

CREATE TABLE County( CREATE TABLE River(


Name varchar(30), Name varchar(30),
State varchar(30), Source varchar(30),
Pop Integer, Distance Integer,
Shape Polygon); Shape LineString);
Examples
 A database:
 Relation counties(sname: string, area: region, spop: int)
 Relation cities(cname: string, center: point; ext: region)
 Relation rivers(rname: string, route:line)
 SELECT * FROM rivers WHERE route intersects R
 SELECT cname, sname FROM cities, counties WHERE center inside area
 SELECT rname, length(intersection(route, Kisumu)) FROM rivers WHERE route intersects
Kisumu
Spatial Queries

 Selection queries: “Find all objects inside query q”,


inside-> intersects, north
 Nearest Neighbor-queries: “Find the closest object
to a query point q”, k-closest objects
 Spatial join queries: Two spatial relations S1 and S2, find
all pairs: {x in S1, y in S2, and x rel y= true}, rel= intersect,
inside, etc
Assignment 04
58

Each student will choose a system for a database design (based on the
previous assignment). You are required to:
1. Design
a) Prepare a revised ER diagram using pgAdmin 4 tool
b) Forward engineer the diagram to SQL
c) Execute the SQL
The revised diagram, the SQL text and an empty database backup(sql)
should be zipped together and uploaded to
https://forms.gle/EL3CRSWhjFuMfeTb7
Deadline next week.
Felix Mutua, PhD

You might also like