SQL Basics

SQL (Structured Query Language) is designed to manipulate and manage data stored in relational databases. The document discusses SQL basics including SQL statements, functions to create, manipulate, and query data from tables.

Uploaded by

fanlingjie1502
Copyright: © All Rights Reserved

SQL basics

July 28, 2023

The most common shape for data is a spreadsheet or table. The things we are measuring (variables) are in the columns,
and the individual instances (observations) are in the rows. We can read each column “down” the table (viewing
multiple observations), and each row “across” the table (viewing multiple variables).

What is SQL
• SQL (Structured Query Language) is a programming language designed to manipulate and manage data stored in relational databases.
• It is the standard computer language for accessing and processing relational databases.

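What is a relational database
• A relational database is a database that organizes information into one or more tables.
• A table is a collection of data organized into rows and columns.
• Put simply, a relational database is a set of tables. "Relational" can be read as every value being tied to a row and a column, and this is the kind of database SQL can work with.
• A non-relational database, by contrast, is a NoSQL database. A loose analogy is a Word document, where all the data is piled together without a row-and-column structure.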

What is a SQL statement
• A statement is a string of characters that the database recognizes as a valid command. It usually ends with a semicolon (;).
• A statement has two parts: clauses and parameters. A clause is the specific command and is conventionally written in all capitals; a parameter is what is passed to the clause (for example, the content inside the parentheses).

What is querying
• Retrieving information stored in a database, one of the core purposes of the SQL language.

MANIPULATIVE FUNCTIONS

CREATE TABLE
Creates a new table in the database.
CREATE TABLE table_name (
    column_1 data_type,
    column_2 data_type,
    column_3 data_type
);
• INTEGER, a positive or negative whole number
• TEXT, a text string
• DATE, the date formatted as YYYY-MM-DD
• REAL, a decimal value

Constraints
Constraints are rules applied to the values of individual columns. They come after the data type.
CREATE TABLE celebs (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE,
    date_of_birth TEXT NOT NULL,
    date_of_death TEXT DEFAULT 'Not Applicable'
);
• Constraints are written in all capitals.
• [PRIMARY KEY] columns uniquely identify each row. Attempts to insert a row with a value identical to one already in the table will result in a constraint violation (i.e., it is basically a serial number: the same value cannot be entered twice, and a table can have only one PRIMARY KEY).
• [UNIQUE] columns have a different value for every row. This is similar to PRIMARY KEY, except that a table can have many different UNIQUE columns.
• [NOT NULL] columns must have a value.
• [DEFAULT] columns take an additional argument that will be the assumed value for an inserted row if the new row does not specify a value for that column (i.e., it sets a preset value for the column).
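
The four constraints above can be tried out with a short script. This is a minimal sketch that assumes Python's built-in sqlite3 module and an in-memory SQLite database; the sample names and dates are made up for illustration, and other databases may word their constraint errors differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE celebs (
        id INTEGER PRIMARY KEY,
        name TEXT UNIQUE,
        date_of_birth TEXT NOT NULL,
        date_of_death TEXT DEFAULT 'Not Applicable'
    )
""")

# date_of_death is omitted here, so the DEFAULT value is stored.
conn.execute(
    "INSERT INTO celebs (id, name, date_of_birth) "
    "VALUES (1, 'Sample Name', '1990-01-01')"
)
default_value = conn.execute(
    "SELECT date_of_death FROM celebs WHERE id = 1"
).fetchone()[0]


def violates(sql):
    """Return True if running sql raises a constraint violation."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Same id as an existing row: PRIMARY KEY violation.
duplicate_id = violates(
    "INSERT INTO celebs (id, name, date_of_birth) "
    "VALUES (1, 'Other Name', '1985-05-05')"
)
# Same name as an existing row: UNIQUE violation.
duplicate_name = violates(
    "INSERT INTO celebs (id, name, date_of_birth) "
    "VALUES (2, 'Sample Name', '1985-05-05')"
)
# No date_of_birth given: NOT NULL violation.
missing_birth = violates("INSERT INTO celebs (id, name) VALUES (3, 'Third Name')")
```

Each rejected INSERT leaves the table unchanged, so only the first row is stored.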

INSERT INTO… VALUES…
Adds rows to a table: inserts a new row into a table.
INSERT INTO table_name (column_1, column_2, column_3)
VALUES (value_1, value_2, value_3);
• [INSERT INTO] is a clause that adds the specified row or rows. Its parameter identifies the columns that data will be inserted into.
• [VALUES] is a clause that indicates the data being inserted. Its parameter identifies the values being inserted.

ALTER TABLE… ADD COLUMN…
Adds a column to a table: ALTER TABLE makes changes to the table itself, here by adding a new column.
ALTER TABLE table_name
ADD COLUMN column_name data_type;
• Only a change to the columns counts as altering the table, because the content of each row is just a record / observation.
• ALTER modifies the structure of the table (by adding, removing or renaming columns).
• UPDATE modifies the information contained in the table.
• [NULL] is a special value in SQL that represents missing or unknown data. Here, the rows that existed before the column was added have NULL (∅) values for the new column.
• By default, a new column is added at the end of the table.
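
To see the effect of ADD COLUMN on rows that already exist, here is a small sketch, again assuming Python's sqlite3 module with an in-memory database. The twitter_handle column and the sample row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE celebs (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO celebs (id, name) VALUES (1, 'Sample Name')")

# Change the structure of the table by adding a column.
conn.execute("ALTER TABLE celebs ADD COLUMN twitter_handle TEXT")

# PRAGMA table_info is SQLite-specific; index 1 of each row is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(celebs)")]

# The row that existed before the change has NULL (None in Python) in the new column.
existing_row = conn.execute("SELECT * FROM celebs").fetchone()
```

The new column appears last in the column list, and the pre-existing row reads back with None in that position.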

UPDATE… SET… WHERE…
Changes the data (rows) in a table: changes existing records.
UPDATE table_name
SET column_1 = value_1_new
WHERE condition;

• [WHERE] is a clause that indicates which row(s) to change. Without a WHERE clause, every row in the table is updated.

DELETE FROM… WHERE… IS…
Deletes data (rows) from a table: removes one or more rows.
DELETE FROM table_name
WHERE column_name IS NULL; -- delete all records in the table with no value in this column

DELETE FROM table_name
LIMIT value_integer; -- delete only the first value_integer rows; LIMIT on DELETE is not standard SQL and is only available in some databases (MySQL, for example)

DELETE FROM table_name
WHERE column_1 IN (value_1, value_2, value_3) AND column_2 IS NULL; -- among the rows whose column_1 is one of the three values, delete those that have no value in column_2

• [IS NULL] is a condition in SQL that returns true when the value is NULL and false otherwise.
• To clear a value while keeping the row in place, use UPDATE and set the value to NULL instead of deleting the row.
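
The difference between clearing a value with UPDATE and removing rows with DELETE can be sketched as follows. The example assumes Python's sqlite3 module; the table contents are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE celebs (id INTEGER PRIMARY KEY, name TEXT, twitter_handle TEXT)"
)
conn.executemany(
    "INSERT INTO celebs (id, name, twitter_handle) VALUES (?, ?, ?)",
    [(1, "First", "@first"), (2, "Second", None), (3, "Third", "@third")],
)

# UPDATE changes a value in place: row 3 stays, its handle becomes NULL.
conn.execute("UPDATE celebs SET twitter_handle = NULL WHERE id = 3")
after_update = conn.execute(
    "SELECT id, twitter_handle FROM celebs ORDER BY id"
).fetchall()

# DELETE removes whole rows: every row whose handle IS NULL disappears.
conn.execute("DELETE FROM celebs WHERE twitter_handle IS NULL")
after_delete = conn.execute("SELECT id FROM celebs ORDER BY id").fetchall()
```

After the UPDATE all three rows are still present; after the DELETE only the row with a non-NULL handle remains.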

QUERY FUNCTIONS

SELECT… FROM
Retrieves data from a table: fetches data from a database.
SELECT column_1, column_2 FROM table_name; -- select specific columns from the table
SELECT * FROM table_name; -- [*] is a wildcard character that selects all columns in a table
SELECT column_1 AS 'alias_name' FROM table_name; -- give the queried column an alias in the result; the column itself is not renamed

• SELECT statements always return a new table called the result set.

SELECT DISTINCT…
Retrieves data from a table but shows each duplicate only once: used to return unique values in the output.
SELECT DISTINCT column_name
FROM table_name;

SELECT… FROM… WHERE…
Retrieves the data that meets a given condition: filters the result set to only include rows where the following condition is true.
SELECT * FROM table_name -- or list specific column names instead of *
WHERE condition;

• Comparison operators include =, !=, >, <, >=, <=.
• Operators create a condition that can be evaluated as either true or false.

• [LIKE] is a special operator used within the WHERE clause to search for a specific pattern in a column. It is usually followed by a pattern with wildcard characters.
• [_] is a wildcard that stands for exactly one character: any single character can be substituted here without breaking the pattern.
• [%] is a wildcard that matches zero or more characters in the pattern.
SELECT * FROM table_name
WHERE column_name LIKE 'Se_en'; -- the names Seven and Se7en both match this pattern
SELECT * FROM table_name
WHERE column_name LIKE '%man%'; -- matches all names that contain 'man'

• [IS NULL] / [IS NOT NULL] are special operators used within the WHERE clause to identify missing values.
SELECT * FROM table_name
WHERE column_name IS NULL;
SELECT * FROM table_name
WHERE column_name IS NOT NULL;

• [BETWEEN] is a special operator used within the WHERE clause to filter the result set to a certain range. It accepts two values that are numbers, text or dates, and the range includes both ends for every data type. For text the comparison is alphabetical, which makes the upper end look exclusive: only the exact value of the second argument matches, not longer strings that start with it.
SELECT * FROM table_name
WHERE column_name BETWEEN 1990 AND 2000; -- rows whose year is between 1990 and 2000, including both ends
SELECT * FROM table_name
WHERE column_name BETWEEN 'A' AND 'J'; -- rows that begin with 'A' up to, but not including, those that begin with 'J'. A movie named simply 'J' does match, because BETWEEN goes up to the second value, 'J'. 'Jaws' sorts after 'J', so it is not in the result set.

• [AND] is a special operator used within the WHERE clause to combine multiple conditions. With AND, both conditions must be true for the row to be included in the result.
SELECT * FROM table_name
WHERE condition_1 AND condition_2;
• [OR] is similar to AND, but includes a row if any of the conditions is true.
SELECT * FROM table_name
WHERE condition_1 OR condition_2;
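
The WHERE operators above can be checked against a tiny table. This is a sketch assuming Python's sqlite3 module and SQLite's default text comparison; the movies table, its rows and the helper function names() are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (name TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO movies (name, year) VALUES (?, ?)",
    [("Seven", 1995), ("Se7en", 1995), ("Batman", 1989),
     ("Avatar", 2009), ("Jaws", 1975), ("J", 2000), ("Untitled", None)],
)


def names(where_clause):
    """Return the movie names matching where_clause, sorted by name."""
    sql = "SELECT name FROM movies WHERE " + where_clause + " ORDER BY name"
    return [row[0] for row in conn.execute(sql)]


underscore = names("name LIKE 'Se_en'")            # _ matches one character
percent = names("name LIKE '%man%'")               # % matches zero or more
missing_year = names("year IS NULL")               # rows with no year
year_range = names("year BETWEEN 1990 AND 2000")   # both ends included
text_range = names("name BETWEEN 'A' AND 'J'")     # 'J' matches, 'Jaws' does not
both = names("year > 1990 AND name LIKE 'S%'")     # both conditions true
either = names("year < 1980 OR year > 2005")       # at least one condition true
```

Note that the row with a NULL year never satisfies a comparison such as year > 1990; it only shows up through IS NULL.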

SELECT… FROM… ORDER BY…
Retrieves data from a table sorted by a given rule: lists the data in the result set in a particular order.
SELECT * FROM table_name
ORDER BY column_name; -- sort the result set by a particular column (ascending by default)
SELECT * FROM table_name
ORDER BY column_name DESC; -- sort by a particular column in descending order (use ASC for ascending)
SELECT * FROM table_name
ORDER BY column_1, column_2; -- sort the result set first by column_1, then by column_2

• If a WHERE clause is used, ORDER BY always goes after the WHERE clause.

SELECT… FROM… LIMIT…
Retrieves a limited number of rows: LIMIT is a clause that lets you specify the maximum number of rows the result set will have.
SELECT * FROM table_name
LIMIT number;

• [LIMIT] always goes at the very end of the query.
• If the number set in the LIMIT clause exceeds the number of rows available to select, the query simply returns all available rows.
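
The clause order (WHERE, then ORDER BY, then LIMIT) and the behaviour of an oversized LIMIT can be sketched like this, assuming Python's sqlite3 module. The movies table and its ratings are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (name TEXT, year INTEGER, rating REAL)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?, ?)",
    [("A", 2001, 7.5), ("B", 1999, 8.1), ("C", 2001, 6.9), ("D", 1985, 8.1)],
)

# WHERE filters first, ORDER BY sorts the survivors, LIMIT comes last.
top_two = conn.execute(
    "SELECT name FROM movies WHERE year > 1990 ORDER BY rating DESC LIMIT 2"
).fetchall()

# Sort by year (ascending by default), then by name within the same year.
by_year_then_name = conn.execute(
    "SELECT year, name FROM movies ORDER BY year, name"
).fetchall()

# A LIMIT larger than the table simply returns every available row.
oversized_limit = conn.execute("SELECT name FROM movies LIMIT 100").fetchall()
```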

CASE
Conditional expression: SQL's if-then logic, which lets us create different outputs.
SELECT column_1,
    CASE
        WHEN condition_1 THEN output_1
        ELSE output_2
    END AS case_name -- AS case_name is optional; it sets the name of the returned column
FROM table_name; -- returns column_1 and case_name

• A [CASE] expression is usually placed in the SELECT clause, and it must end with END.
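
A CASE expression with more than one WHEN branch might look like the sketch below. It assumes Python's sqlite3 module; the rating thresholds and labels are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (name TEXT, rating REAL)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?)",
    [("A", 8.4), ("B", 6.2), ("C", 4.0)],
)

# Branches are tested top to bottom; the first true WHEN wins, ELSE is the fallback.
rows = conn.execute("""
    SELECT name,
        CASE
            WHEN rating > 8 THEN 'Fantastic'
            WHEN rating > 6 THEN 'Okay'
            ELSE 'Avoid'
        END AS review
    FROM movies
    ORDER BY name
""").fetchall()
```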

AGGREGATE FUNCTIONS

COUNT
Counts the non-empty rows of a column: takes the name of a column as an argument and counts the number of non-empty (non-NULL) values in that column.
SELECT COUNT(column_name)
FROM table_name;

SUM
Adds up the values of a column: takes the name of a column as an argument and returns the sum of all the values in that column.
SELECT SUM(column_name)
FROM table_name;

MAX / MIN
Find the largest and smallest value of a column: each takes the name of a column as an argument and returns the largest (MAX) or smallest (MIN) value in that column.
SELECT MAX(column_name)
FROM table_name;
SELECT MIN(column_name)
FROM table_name;

AVG
Averages the values of a column: takes a column name as an argument and returns the average value for that column.
SELECT AVG(column_name)
FROM table_name;

ROUND
Rounds the values of a column: takes two arguments inside the parentheses: 1. a column name, 2. an integer. It rounds the values in the column to the number of decimal places specified by the integer.
SELECT ROUND(column_name, integer)
FROM table_name;

SELECT column_1, ROUND(column_2, 2) -- return column_1 and column_2 rounded to two decimal places
FROM table_name;
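
The functions in this section can be compared side by side on one small table. This sketch assumes Python's sqlite3 module; the apps table and its numbers are made up. COUNT(*), which counts rows regardless of NULLs, is included only as a contrast to COUNT(column_name).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE apps (name TEXT, price REAL, downloads INTEGER)")
conn.executemany(
    "INSERT INTO apps VALUES (?, ?, ?)",
    [("a", 0.99, 100), ("b", 2.99, 250), ("c", None, 50), ("d", 1.99, 600)],
)


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


count_rows = scalar("SELECT COUNT(*) FROM apps")        # all rows
count_price = scalar("SELECT COUNT(price) FROM apps")   # NULL price is skipped
total_downloads = scalar("SELECT SUM(downloads) FROM apps")
max_downloads = scalar("SELECT MAX(downloads) FROM apps")
min_downloads = scalar("SELECT MIN(downloads) FROM apps")
# AVG also ignores NULLs; ROUND trims the result to two decimal places.
avg_price = scalar("SELECT ROUND(AVG(price), 2) FROM apps")
```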

GROUP BY
Groups the values returned by a SELECT statement according to the entries of a given column. GROUP BY is a clause used with aggregate functions; it works together with the SELECT statement to arrange identical data into groups.
SELECT column_1, aggregate_function(column_2)
FROM table_name
GROUP BY column_1;

• The GROUP BY clause comes after any WHERE clause, but before ORDER BY or LIMIT.
• SQL lets us use column references in GROUP BY / ORDER BY:
  1 is the first column selected
  2 is the second column selected
  3 is the third column selected
SELECT ROUND(column_1),
    COUNT(column_2)
FROM table_name
GROUP BY 1
ORDER BY 1;
• [HAVING] is very similar to WHERE. In fact, all the kinds of conditions used with WHERE so far can be used with HAVING. The HAVING clause always comes after GROUP BY, but before ORDER BY and LIMIT.
• When we want to limit the results of a query based on values of the individual rows, use WHERE.
• When we want to limit the results of a query based on an aggregate property, use HAVING.
SELECT ROUND(column_1),
    COUNT(column_2)
FROM table_name
GROUP BY 1
HAVING COUNT(column_2) > 5; -- return ROUND(column_1) and COUNT(column_2) only for groups whose count is greater than 5
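
The split between WHERE (filters rows before grouping) and HAVING (filters groups after aggregation) is sketched below, assuming Python's sqlite3 module. The apps table, its categories and prices are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE apps (category TEXT, price REAL)")
conn.executemany(
    "INSERT INTO apps VALUES (?, ?)",
    [("Games", 0.0), ("Games", 1.99), ("Games", 2.99),
     ("Music", 0.99), ("Music", 4.99), ("News", 0.0)],
)

# One output row per category; 1 refers to the first selected column.
per_category = conn.execute(
    "SELECT category, COUNT(*) FROM apps GROUP BY 1 ORDER BY 1"
).fetchall()

# WHERE drops free apps row by row, then HAVING keeps only the
# categories that still contain more than one app.
paid_groups = conn.execute("""
    SELECT category, COUNT(*)
    FROM apps
    WHERE price > 0
    GROUP BY category
    HAVING COUNT(*) > 1
    ORDER BY category
""").fetchall()
```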

MULTIPLE TABLES
JOIN
Links / merges tables on a column they have in common: combines tables.
SELECT *
FROM table_1
JOIN table_2
ON table_1.column_1 = table_2.column_1; -- match table_1's column_1 with table_2's column_1

• The table_name.column_name syntax can be used not only in the ON clause but also in SELECT or any other clause where we refer to column names.
SELECT table_1.column_1, table_2.column_2
FROM table_1
JOIN table_2
ON table_1.column_1 = table_2.column_1;
• [INNER JOIN, keeps only the matching rows] is the default JOIN, and it only returns rows that match the condition specified by ON.
• [LEFT JOIN, keeps the rows that match in both tables plus the remaining rows of table 1] keeps all rows from the first table, regardless of whether there is a matching row in the second table. If the join condition is not met, LEFT JOIN fills the columns of the right table with NULL.
SELECT *
FROM table1
LEFT JOIN table2
ON table1.c2 = table2.c2;
• [CROSS JOIN, pairs every row of table 1 with every row of table 2] combines all rows of one table with all rows of another table, giving all the possible combinations. A common usage is when we need to compare each row of a table to a list of values. CROSS JOIN doesn't require an ON clause, because we are not really joining on any columns.
SELECT table1.column1, table2.column2
FROM table1
CROSS JOIN table2;

Primary key & foreign key
• None of the values in a primary key can be NULL.
• Each value must be unique (i.e., you can't have two customers with the same customer_id in the customers table).
• A table cannot have more than one primary key (although some databases let a single primary key span several columns).
• When the primary key for one table appears in a different table, it is called a foreign key.
• Generally, the primary key will just be called id.
• The most common type of join is joining a foreign key from one table with the primary key from another table.
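
The three join types and the primary key / foreign key relationship can be seen together in one sketch. It assumes Python's sqlite3 module; the customers and orders tables are hypothetical, with orders.customer_id acting as a foreign key to customers.customer_id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT)"
)
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "Ann"), (2, "Bob"), (3, "Cy")])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(10, 1, "pen"), (11, 1, "ink"), (12, 2, "pad")])

# INNER JOIN (the default JOIN): only customers that have at least one order.
inner_rows = conn.execute("""
    SELECT customers.name, orders.item
    FROM customers
    JOIN orders
    ON customers.customer_id = orders.customer_id
    ORDER BY orders.order_id
""").fetchall()

# LEFT JOIN: every customer; the right-hand column is NULL when nothing matches.
left_rows = conn.execute("""
    SELECT customers.name, orders.item
    FROM customers
    LEFT JOIN orders
    ON customers.customer_id = orders.customer_id
    ORDER BY customers.customer_id, orders.order_id
""").fetchall()

# CROSS JOIN: every customer paired with every order, no ON clause needed.
cross_count = conn.execute(
    "SELECT COUNT(*) FROM customers CROSS JOIN orders"
).fetchone()[0]
```

Cy has no orders, so Cy is missing from the inner join and appears with None in the left join.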

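UNION
Combines the rows of two tables that have the same columns: stacks table1 on top of table2. The second table must have the same number of columns and the same data types, in the same order, as the first table. UNION removes duplicate rows; UNION ALL keeps them.
SELECT * FROM table1
UNION
SELECT * FROM table2;
• We can also use UNION to quickly make a "mini" dataset by stacking SELECT statements that return literal values.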
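
A short sketch of UNION, its UNION ALL variant, and a "mini" dataset built from literal values, assuming Python's sqlite3 module. The table contents and the month values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE table2 (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [(1, "a"), (2, "b")])
conn.executemany("INSERT INTO table2 VALUES (?, ?)", [(2, "b"), (3, "c")])

# UNION stacks the two result sets and drops the duplicate row (2, 'b').
union_rows = conn.execute(
    "SELECT * FROM table1 UNION SELECT * FROM table2 ORDER BY 1"
).fetchall()

# UNION ALL keeps duplicates.
union_all_rows = conn.execute(
    "SELECT * FROM table1 UNION ALL SELECT * FROM table2 ORDER BY 1"
).fetchall()

# A "mini" dataset built without any table at all.
mini_dataset = conn.execute("""
    SELECT '2017-01-01' AS month
    UNION
    SELECT '2017-02-01' AS month
    ORDER BY 1
""").fetchall()
```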

WITH
Nests another query. Essentially, we put a whole first query inside the parentheses () and give it a name. After that, we can use this name as if it were a table and write a new query that uses the first query.
WITH previous_results AS (
    SELECT ...
    ...
    ...
    … -- note: the query inside the parentheses must not end with a semicolon
)
SELECT *
FROM previous_results
JOIN table2
ON _____ = _____;
• The [WITH] clause allows us to perform a separate query and reuse its result.
• [previous_results] is the alias that we use to reference any columns from the query inside the WITH clause.
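
One possible filled-in version of the WITH pattern is sketched below, assuming Python's sqlite3 module. The orders and customers tables and the total column are hypothetical; only the alias previous_results comes from the template above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 20.0), (1, 30.0), (2, 5.0)])

# The inner query (no trailing semicolon) totals the orders per customer;
# the outer query then joins that named result as if it were a table.
rows = conn.execute("""
    WITH previous_results AS (
        SELECT customer_id, SUM(amount) AS total
        FROM orders
        GROUP BY customer_id
    )
    SELECT customers.name, previous_results.total
    FROM previous_results
    JOIN customers
    ON previous_results.customer_id = customers.customer_id
    ORDER BY customers.name
""").fetchall()
```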

Subqueries
A subquery is a query nested inside another query. Subqueries are very similar to joins in terms of functionality.

A JOIN:
SELECT book_club.id, book_club.first_name, book_club.last_name
FROM book_club
JOIN art_club
ON book_club.id = art_club.id;

The same result using a subquery instead of a JOIN:
SELECT id, first_name, last_name
FROM book_club
WHERE id IN (
    SELECT id
    FROM art_club);
-- the two queries above return the same result (provided each id appears at most once in art_club)
• Anytime a subquery is present, it gets executed before the external statement is run.
• We can use operators such as <, >, =, and != to compare the results of the external query to those of the inner query.
-- if Olivia (the statistics student with id = 1) decided to drop statistics and take history, we could find the history students who are at or below her grade level with the following query
SELECT *
FROM history_students
WHERE grade <= (
    SELECT grade
    FROM statistics_students
    WHERE id = 1);
• [IN] & [NOT IN]
One of the more common ways to use subqueries is with an IN or NOT IN clause. When an IN clause is used, results retrieved from the external query must appear within the subquery results. Similarly, when a NOT IN clause is used, results retrieved from the external query must not appear within the subquery results.
• [EXISTS] & [NOT EXISTS]
Recall that when a subquery is included, the inner query runs before the external query. When the inner query is included using an IN or NOT IN clause, all rows meeting the inner query's criteria are returned and then compared against the external query's criteria. However, when the inner query is included using an EXISTS or NOT EXISTS clause, we are only checking for the presence of rows meeting the specified criteria, so the inner query only returns true or false.
In terms of efficiency, EXISTS / NOT EXISTS are usually more efficient than IN / NOT IN. The IN / NOT IN clause has to return all rows meeting the specified criteria, whereas EXISTS / NOT EXISTS only needs to find one matching row to determine whether to return true or false. How large the difference is depends on the database.
SELECT grade
FROM band_students
WHERE EXISTS (
    SELECT grade
    FROM drama_students
);
-- as written, the subquery does not refer to band_students, so this returns every row of band_students whenever drama_students is not empty; in practice a condition linking the two tables is added inside the subquery
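
The JOIN, IN, NOT IN and EXISTS forms can be compared on the same data. This is a sketch assuming Python's sqlite3 module; the club members are invented, and the EXISTS query adds a hypothetical linking condition (art_club.id = book_club.id) so that it checks each book_club row separately.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book_club (id INTEGER PRIMARY KEY, first_name TEXT)")
conn.execute("CREATE TABLE art_club (id INTEGER PRIMARY KEY, first_name TEXT)")
conn.executemany("INSERT INTO book_club VALUES (?, ?)",
                 [(1, "Ann"), (2, "Bob"), (3, "Cy")])
conn.executemany("INSERT INTO art_club VALUES (?, ?)",
                 [(2, "Bob"), (3, "Cy"), (4, "Di")])


def ids(sql):
    """Run a query that selects one id column and return the ids as a list."""
    return [row[0] for row in conn.execute(sql)]


via_join = ids("""
    SELECT book_club.id FROM book_club
    JOIN art_club ON book_club.id = art_club.id
    ORDER BY book_club.id
""")
via_in = ids("""
    SELECT id FROM book_club
    WHERE id IN (SELECT id FROM art_club)
    ORDER BY id
""")
via_not_in = ids("""
    SELECT id FROM book_club
    WHERE id NOT IN (SELECT id FROM art_club)
    ORDER BY id
""")
# The subquery refers to the outer row, so EXISTS is evaluated per member.
via_exists = ids("""
    SELECT id FROM book_club
    WHERE EXISTS (
        SELECT 1 FROM art_club WHERE art_club.id = book_club.id
    )
    ORDER BY id
""")
```

With unique ids in both tables, the JOIN, IN and EXISTS versions all return the members who belong to both clubs.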
