SQL Cheatsheet

SQL is a standard language used to query and manage data in relational database management systems. It includes statements for data manipulation (DML), data definition (DDL), and data control (DCL). The SELECT statement retrieves data from one or more tables and consists of clauses like SELECT, FROM, WHERE, GROUP BY, HAVING, and ORDER BY that are evaluated in a defined order. Constraints like primary keys, unique keys, foreign keys, check constraints, default constraints, and NOT NULL help define the structure of tables and enforce data integrity.

Uploaded by

davita

SQL (Structured Query Language)

SQL is a standard language that was designed to query and manage data
in relational database management systems (RDBMSs). An RDBMS is a
database management system based on the relational model (a semantic
model for representing data).

The goal of the relational model is to enable consistent representation of
data with minimal or no redundancy and without sacrificing completeness,
and to define data integrity (enforcement of data consistency) as part of
the model.

In any SQL dialect, statements are grouped into several types:

• Data Manipulation Language (DML) is the set of SQL statements that focuses on
querying and modifying data. DML statements include SELECT, the primary focus
of this training, and modification statements such as INSERT, UPDATE, and DELETE.
• Data Definition Language (DDL) is the set of SQL statements that handles the
definition and life cycle of database objects, such as tables, views, and procedures.
DDL includes statements such as CREATE, ALTER, and DROP.
• Data Control Language (DCL) is the set of SQL statements used to manage
security permissions for users and objects. DCL includes statements such as
GRANT, REVOKE, and DENY.

Clauses
• The SELECT clause specifies the columns (or expressions) to return.
• The FROM clause identifies which table is the source of the rows for the
query.
• The WHERE clause filters rows, keeping only those rows that satisfy the
specified condition.
• The GROUP BY clause takes the rows that met the filter condition and
groups them.
• The HAVING clause filters the groups based on its own predicate.
• The ORDER BY clause sorts the output.

Logically, the clauses are evaluated in the following order:
1. The FROM clause is evaluated first, to provide the source rows for the rest of the
statement. A virtual table is created and passed to the next step.
2. The WHERE clause is evaluated next, keeping only those rows from the source
table that satisfy its predicate. The filtered virtual table is passed to the next step.
3. GROUP BY is next, organizing the rows in the virtual table according to the unique
values found in the GROUP BY list. A new virtual table is created, containing the
list of groups, and is passed to the next step. From this point in the flow of
operations, only columns in the GROUP BY list or aggregate functions may be
referenced by other elements.
4. The HAVING clause is evaluated next, filtering out entire groups based on its
predicate. The virtual table created in step 3 is filtered and passed to the next step.
5. The SELECT clause is evaluated next, determining which columns will appear in the
query results. Because the SELECT clause is evaluated after the other steps, any
column aliases created there cannot be used in the WHERE, GROUP BY, or
HAVING clause.
6. The ORDER BY clause is the last to execute, sorting the rows as determined by its
column list.
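
The evaluation order above can be sketched with a small runnable example. This is a minimal sketch, not SQL Server code: it assumes SQLite (through Python's built-in sqlite3 module) as a stand-in engine, and the Orders table, its columns, and its rows are hypothetical. The numbered comments mark the logical step at which each clause is evaluated.

```python
import sqlite3

def orders_per_customer():
    """Count orders of at least 10 per customer; keep customers with 2 or more."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.executescript("""
        CREATE TABLE Orders (orderid INTEGER PRIMARY KEY, custid INTEGER, amount REAL);
        INSERT INTO Orders (orderid, custid, amount) VALUES
            (1, 10, 50.0), (2, 10, 5.0), (3, 10, 70.0),
            (4, 20, 80.0), (5, 30, 40.0), (6, 30, 90.0);
    """)
    rows = conn.execute("""
        SELECT custid, COUNT(*) AS numorders   -- 5. choose the output columns
        FROM Orders                            -- 1. provide the source rows
        WHERE amount >= 10                     -- 2. filter rows
        GROUP BY custid                        -- 3. form groups
        HAVING COUNT(*) >= 2                   -- 4. filter groups
        ORDER BY custid DESC                   -- 6. sort the output
    """).fetchall()
    conn.close()
    return rows

result = orders_per_customer()
print(result)  # [(30, 2), (10, 2)]
```

Customer 20 has only one qualifying order, so its group is removed by HAVING; order 2 is removed earlier by WHERE, before any grouping takes place.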
Operator Precedence
When multiple operators appear in the same expression, SQL Server
evaluates them based on operator precedence rules. The following list
describes the precedence among operators, from highest to lowest:
1. ( ) (Parentheses)
2. * (Multiplication), / (Division), % (Modulo)
3. + (Positive), - (Negative), + (Addition), + (Concatenation), -
(Subtraction)
4. =, >, <, >=, <=, <>, !=, !>, !< (Comparison operators)
5. NOT
6. AND
7. BETWEEN, IN, LIKE, OR
8. = (Assignment)
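
Two of the precedence rules listed above (multiplication before addition, and AND before OR) can be checked with a short sketch. It assumes SQLite via Python's sqlite3 module rather than SQL Server; SQLite follows the same ordering for these particular operators, but unlike T-SQL it lets a logical expression appear directly in the SELECT list and returns 1 or 0 for it. The helper name evaluate is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def evaluate(expression):
    """Return the single value produced by a scalar SQL expression."""
    return conn.execute("SELECT " + expression).fetchone()[0]

# Multiplication binds tighter than addition.
no_parens = evaluate("2 + 3 * 4")          # evaluated as 2 + (3 * 4)
with_parens = evaluate("(2 + 3) * 4")      # parentheses force the addition first

# AND binds tighter than OR, so this reads as: 1 = 1 OR (1 = 0 AND 1 = 0).
and_before_or = evaluate("1 = 1 OR 1 = 0 AND 1 = 0")
forced_or_first = evaluate("(1 = 1 OR 1 = 0) AND 1 = 0")

print(no_parens, with_parens, and_before_or, forced_or_first)  # 14 20 1 0
```

When in doubt, adding parentheses documents the intended order and costs nothing.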

Tips
• Capitalize T-SQL keywords, like SELECT, FROM, AS, and so on.
Capitalizing keywords is a commonly used convention that makes it
easier to find each clause of a complex statement.
• Start a new line for each major clause of a statement.
• If the SELECT list contains more than a few columns, expressions, or
aliases, consider listing each column on its own line.
• Indent lines containing subclauses or columns to make it clear which
code belongs to each major clause.

Constraints
One of the greatest benefits of the relational model is the ability to define
data integrity as part of the model. Data integrity is achieved through rules
called constraints that are defined in the data model and enforced by the
RDBMS.
Primary Key Constraints
A primary key constraint enforces uniqueness of rows and also disallows
NULL marks in the constraint attributes. Each unique set of values in the
constraint attributes can appear only once in the table—in other words,
only in one row. An attempt to define a primary key constraint on a column
that allows NULL marks will be rejected by the RDBMS. Each table can have
only one primary key.

Unique Constraints

A unique constraint enforces the uniqueness of rows, allowing you to
implement the concept of alternate keys from the relational model in your
database. Unlike with primary keys, you can define multiple unique
constraints within the same table. Also, a unique constraint is not restricted
to columns defined as NOT NULL. According to standard SQL, a column
with a unique constraint is supposed to allow multiple NULL marks (as if
two NULL marks were different from each other). However, SQL Server’s
implementation rejects duplicate NULL marks (as if two NULL marks were
equal to each other).

Foreign Key Constraints


A foreign key enforces referential integrity. This constraint is defined on
one or more attributes in what’s called the referencing table and points to
candidate key (primary key or unique constraint) attributes in what’s called
the referenced table. Note that the referencing and referenced tables can
be one and the same. The foreign key’s purpose is to restrict the values
allowed in the foreign key columns to those that exist in the referenced
columns.

Check Constraints
A check constraint allows you to define a predicate that a row must meet
to be entered into the table or to be modified. For example, a check
constraint can ensure that the salary column in the Employees table
accepts only positive values.

Default Constraints
A default constraint is associated with a particular attribute. It is an
expression that is used as the default value when an explicit value is not
specified for the attribute when you insert a row.

NOT NULL Constraint

When applied to a column, a NOT NULL constraint ensures that the column
cannot contain NULL marks.
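
The constraint types described above (primary key, unique, foreign key, check, default, and NOT NULL) can be tried out together in one sketch. It assumes SQLite through Python's sqlite3 module, not SQL Server, so syntax and some behavior differ: SQLite enforces foreign keys only after PRAGMA foreign_keys = ON, and it follows standard SQL in allowing multiple NULL marks in a unique column. The Departments and Employees tables and the helper is_rejected are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK enforcement
conn.executescript("""
    CREATE TABLE Departments (
        deptid INTEGER NOT NULL PRIMARY KEY,
        name   TEXT    NOT NULL UNIQUE
    );
    CREATE TABLE Employees (
        empid  INTEGER NOT NULL PRIMARY KEY,
        ssn    TEXT    NOT NULL UNIQUE,
        deptid INTEGER NOT NULL REFERENCES Departments (deptid),
        salary REAL    NOT NULL DEFAULT 1000 CHECK (salary > 0)
    );
    INSERT INTO Departments (deptid, name) VALUES (1, 'Sales');
    INSERT INTO Employees (empid, ssn, deptid) VALUES (1, '111', 1);
""")

def is_rejected(statement):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(statement)
    except sqlite3.IntegrityError:
        return True
    return False

# The first employee was inserted without a salary, so the default applies.
default_salary = conn.execute(
    "SELECT salary FROM Employees WHERE empid = 1").fetchone()[0]

# Each statement below breaks exactly one constraint.
results = {
    "primary key": is_rejected(
        "INSERT INTO Employees (empid, ssn, deptid) VALUES (1, '222', 1)"),
    "unique": is_rejected(
        "INSERT INTO Employees (empid, ssn, deptid) VALUES (2, '111', 1)"),
    "foreign key": is_rejected(
        "INSERT INTO Employees (empid, ssn, deptid) VALUES (3, '333', 99)"),
    "check": is_rejected(
        "INSERT INTO Employees (empid, ssn, deptid, salary) VALUES (4, '444', 1, -5)"),
    "not null": is_rejected(
        "INSERT INTO Employees (empid, ssn, deptid) VALUES (5, NULL, 1)"),
}
print(default_salary, results)
```

Every entry in results should be True, because the RDBMS refuses each offending row instead of storing inconsistent data.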

INDEX Constraint
Indexes are created to speed up data retrieval from the database. An index
can be created on a single column or on a group of columns in a table. A
table can have only one PRIMARY KEY but can have multiple indexes. An
index can be unique or non-unique, depending on requirements.
SQL Indexes
SQL indexes are special lookup structures that are used to speed up data
retrieval. They hold pointers that refer to the data stored in a table, which
makes it easier to locate the required rows.

Types of Indexes:
• Unique Index
• Single-Column Index
• Composite Index
• Implicit Index

Unique indexes are used not only for performance, but also for data
integrity. A unique index does not allow duplicate values to be inserted
into the indexed column(s). It is created automatically by PRIMARY KEY
and UNIQUE constraints when they are applied to a table, in order to
prevent the user from inserting duplicate values.

A single-column index is created on only one table column.

A composite index is an index created on two or more columns of a table.

Implicit indexes are indexes that are automatically created by the
database server when an object is created.
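
The index types listed above can be illustrated with a short sketch. It assumes SQLite via Python's sqlite3 module as a stand-in for SQL Server; the CREATE INDEX statements use the common form shared by most dialects, while PRAGMA index_list and the sqlite_autoindex name are SQLite-specific. The Customers table, the index names, and the helper index_names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (
        custid INTEGER PRIMARY KEY,
        email  TEXT UNIQUE,   -- the UNIQUE constraint creates an implicit index
        city   TEXT,
        name   TEXT,
        phone  TEXT
    );
    -- Single-column index
    CREATE INDEX idx_customers_city ON Customers (city);
    -- Composite index on two columns
    CREATE INDEX idx_customers_city_name ON Customers (city, name);
    -- Unique index: also rejects duplicate phone values
    CREATE UNIQUE INDEX idx_customers_phone ON Customers (phone);
""")

def index_names(table):
    """List the names of all indexes SQLite reports for the given table."""
    rows = conn.execute(f"PRAGMA index_list('{table}')").fetchall()
    return sorted(row[1] for row in rows)  # column 1 holds the index name

print(index_names("Customers"))
```

The list contains the three explicitly created indexes plus one automatically named index that SQLite created for the UNIQUE constraint on email.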

A clustered index is a type of index that determines the physical order in
which the rows of a table are stored. When a clustered index is defined on
a column, the rows of the table are kept sorted by that column. This helps
retrieve data faster, because the rows are stored in a known order.

A non-clustered index also speeds up lookups, but it is stored separately
from the table data. When defined on a column, it creates a separate
structure that contains a copy of the indexed columns along with a pointer
to the location of the actual row in the table. Unlike a clustered index, a
non-clustered index does not change the physical order of the rows in the
table.

Filters
The TOP option is a proprietary T-SQL feature that allows you to limit the number or
percentage of rows that your query returns.

The OFFSET-FETCH filter, available since SQL Server 2012, is considered
part of the ORDER BY clause, which normally serves a presentation ordering
purpose. By using the OFFSET clause, you can indicate how many rows to
skip, and by using the FETCH clause, you can indicate how many rows to
return after the skipped rows.
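
The idea behind both filters (return the first n rows, or skip some rows and return the next few) can be sketched as follows. Note the assumption: this uses SQLite via Python's sqlite3 module, which supports neither TOP nor OFFSET-FETCH, so LIMIT and OFFSET serve as analogues; the corresponding T-SQL form is quoted in each docstring. The Orders table and the helpers first_rows and page are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (orderid INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO Orders (orderid) VALUES (?)",
                 [(i,) for i in range(1, 11)])  # order IDs 1 through 10

def first_rows(n):
    """Analogue of T-SQL: SELECT TOP (n) orderid FROM Orders ORDER BY orderid."""
    rows = conn.execute(
        "SELECT orderid FROM Orders ORDER BY orderid LIMIT ?", (n,))
    return [row[0] for row in rows]

def page(skip, take):
    """Analogue of T-SQL:
    ORDER BY orderid OFFSET skip ROWS FETCH NEXT take ROWS ONLY."""
    rows = conn.execute(
        "SELECT orderid FROM Orders ORDER BY orderid LIMIT ? OFFSET ?",
        (take, skip))
    return [row[0] for row in rows]

print(first_rows(3))  # [1, 2, 3]
print(page(4, 3))     # [5, 6, 7]
```

In both dialects the result is only deterministic when the ORDER BY list uniquely identifies the rows.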

Predicates

The BETWEEN predicate allows you to check whether a value is in a
specified range, inclusive of the two specified boundary values.

The LIKE predicate allows you to check whether a character string value
matches a specified pattern.

T-SQL supports a predicate called EXISTS that accepts a subquery as input
and returns TRUE if the subquery returns any rows and FALSE otherwise.
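
The three predicates above can be sketched in a few queries. This is an illustrative example under assumptions: it runs on SQLite through Python's sqlite3 module rather than SQL Server, and the Customers and Orders tables, their rows, and the helper column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (custid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Orders (orderid INTEGER PRIMARY KEY, custid INTEGER, amount INTEGER);
    INSERT INTO Customers VALUES (1, 'Davis'), (2, 'Dunn'), (3, 'Ray');
    INSERT INTO Orders VALUES (1, 1, 10), (2, 1, 20), (3, 3, 30);
""")

def column(sql):
    """Run a query and return its first column as a list."""
    return [row[0] for row in conn.execute(sql)]

# BETWEEN includes both boundary values (10 and 20).
between_ids = column(
    "SELECT orderid FROM Orders WHERE amount BETWEEN 10 AND 20 ORDER BY orderid")

# LIKE with the % wildcard: names that start with D.
like_names = column(
    "SELECT name FROM Customers WHERE name LIKE 'D%' ORDER BY name")

# EXISTS is true for a customer as soon as the subquery returns any row.
with_orders = column("""
    SELECT name FROM Customers AS C
    WHERE EXISTS (SELECT * FROM Orders AS O WHERE O.custid = C.custid)
    ORDER BY name
""")

print(between_ids, like_names, with_orders)
# [1, 2] ['Davis', 'Dunn'] ['Davis', 'Ray']
```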

Normalization
Normalization is a formal mathematical process to guarantee that each
entity will be represented by a single relation.

Steps to achieve normalized table:

1NF The first normal form says that the tuples (rows) in the relation
(table) must be unique, and attributes should be atomic.

2NF The second normal form involves two rules. One rule is that the
data must meet the first normal form. The other rule addresses the
relationship between non-key and candidate key attributes. For every
candidate key, every non-key attribute has to be fully functionally
dependent on the entire candidate key.

3NF The third normal form also has two rules. The data must meet
the second normal form. Also, all non-key attributes must be
dependent on candidate keys non-transitively. Informally, this rule
means that all non-key attributes must be mutually independent.

Data Warehouse
A data warehouse (DW) is an environment designed for data retrieval and
reporting purposes. When it serves an entire organization, such an
environment is called a data warehouse; when it serves only part of the
organization (such as a specific department) or a subject matter area in
the organization, it is called a data mart.

The simplest data warehouse design is called a star schema. The star
schema includes several dimension tables and a fact table. Each
dimension table represents a subject by which you want to analyze the
data.

Databases
You can think of a database as a container of objects such as tables,
views, stored procedures, and other objects.

CASE Expressions
A CASE expression is a scalar expression that returns a value based on
conditional logic.

The two forms of CASE expression are simple and searched.

The simple form allows you to compare one value or scalar
expression with a list of possible values and return a value for the first
match. If no value in the list is equal to the tested value, the CASE
expression returns the value that appears in the ELSE clause (if one
exists). If a CASE expression doesn’t have an ELSE clause, it defaults to
ELSE NULL.

The searched CASE form is more flexible because it allows you to
specify predicates, or logical expressions, in the WHEN clauses rather
than restricting you to equality comparisons. The searched CASE
expression returns the value in the THEN clause that is associated
with the first WHEN logical expression that evaluates to TRUE. If none
of the WHEN expressions evaluates to TRUE, the CASE expression
returns the value that appears in the ELSE clause (or NULL if an ELSE
clause is not specified).
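
Both CASE forms can be sketched side by side. The example assumes SQLite via Python's sqlite3 module as a stand-in for SQL Server (the CASE syntax shown is standard SQL); the category and price values and the function names simple_case and searched_case are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def simple_case(category_id):
    """Simple form: compare one value against a list of possible values."""
    return conn.execute("""
        SELECT CASE :category
                 WHEN 1 THEN 'Beverages'
                 WHEN 2 THEN 'Condiments'
                 ELSE 'Unknown'
               END
    """, {"category": category_id}).fetchone()[0]

def searched_case(price):
    """Searched form: each WHEN holds a predicate. No ELSE, so the default is NULL."""
    return conn.execute("""
        SELECT CASE
                 WHEN :price < 10  THEN 'Low'
                 WHEN :price < 100 THEN 'Medium'
               END
    """, {"price": price}).fetchone()[0]

print(simple_case(2), simple_case(7))   # Condiments Unknown
print(searched_case(5), searched_case(50), searched_case(500))  # Low Medium None
```

The last call returns None because no WHEN predicate is true and there is no ELSE clause, so the expression yields NULL.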

NULL Marks

A NULL value means no value or unknown. It does not mean zero or blank,
or even an empty string. Those values are not unknown.

Always keep in mind that T-SQL uses three-valued predicate logic, where
logical expressions can evaluate to TRUE, FALSE, or UNKNOWN.

Note that all aggregate functions ignore NULL marks, with one
exception: COUNT(*).

The correct definition of the treatment SQL has for query filters is “accept
TRUE,” meaning that both FALSE and UNKNOWN are filtered out.
Conversely, the definition of the treatment SQL has for CHECK constraints
is “reject FALSE,” meaning that both TRUE and UNKNOWN are accepted.

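
The NULL behavior described above (aggregates skipping NULL marks, filters accepting only TRUE, CHECK constraints rejecting only FALSE) can be observed in a small sketch. It assumes SQLite via Python's sqlite3 module rather than SQL Server; SQLite treats NULL the same way in these three situations. The Employees table and the helper scalar are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (
        empid  INTEGER PRIMARY KEY,
        salary INTEGER CHECK (salary > 0)  -- UNKNOWN for NULL, so NULL is accepted
    );
    INSERT INTO Employees (empid, salary) VALUES (1, 100), (2, 200), (3, NULL);
""")

def scalar(sql):
    """Run a query that returns one row with one column."""
    return conn.execute(sql).fetchone()[0]

# Aggregates ignore NULL marks, except COUNT(*), which counts rows.
count_star = scalar("SELECT COUNT(*) FROM Employees")
count_salary = scalar("SELECT COUNT(salary) FROM Employees")
avg_salary = scalar("SELECT AVG(salary) FROM Employees")

# Filters accept only TRUE: the NULL row satisfies neither predicate below.
equal_100 = scalar("SELECT COUNT(*) FROM Employees WHERE salary = 100")
not_100 = scalar("SELECT COUNT(*) FROM Employees WHERE salary <> 100")
is_null = scalar("SELECT COUNT(*) FROM Employees WHERE salary IS NULL")

print(count_star, count_salary, avg_salary)  # 3 2 150.0
print(equal_100, not_100, is_null)           # 1 1 1
```

The row with the NULL salary was inserted successfully even though the CHECK predicate is not TRUE for it, and it can only be found with IS NULL.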
Working with Character Data

SQL Server supports two kinds of character data types: regular and
Unicode. Regular data types include CHAR and VARCHAR, and Unicode
data types include NCHAR and NVARCHAR. Regular characters use one
byte of storage for each character, whereas Unicode data requires two
bytes per character.

Any data type without the VAR element (CHAR, NCHAR) in its name has a
fixed length, which means that SQL Server preserves space in the row
based on the column’s defined size and not on the actual number of
characters in the character string.

A data type with the VAR element (VARCHAR, NVARCHAR) in its name has
a variable length, which means that SQL Server uses as much storage space
in the row as required to store the characters that appear in the character
string, plus two extra bytes for offset data.

Working with Date and Time Data

JOINS
A JOIN table operator operates on two input tables. The three fundamental
types of joins are cross joins, inner joins, and outer joins. These three
types of joins differ in how they apply their logical query processing
phases; each type applies a different set of phases. A cross join applies only
one phase—Cartesian Product. An inner join applies two phases—Cartesian
Product and Filter. An outer join applies three phases—Cartesian Product,
Filter, and Add Outer Rows.

A cross join is the simplest type of join. A cross join implements only one
logical query processing phase, a Cartesian Product. This phase operates
on the two tables provided as inputs to the join and produces a Cartesian
product of the two.

An inner join applies two logical query processing phases: it applies a
Cartesian product between the two input tables as in a cross join, and then
it filters rows based on a predicate that you specify.

In an outer join, you mark a table as a “preserved” table by using the
keywords LEFT OUTER JOIN, RIGHT OUTER JOIN, or FULL OUTER JOIN
between the table names. The OUTER keyword is optional. The LEFT
keyword means that the rows of the left table are preserved; the RIGHT
keyword means that the rows in the right table are preserved; and the FULL
keyword means that the rows in both the left and right tables are
preserved. The third logical query processing phase of an outer join
identifies the rows from the preserved table that did not find matches in
the other table based on the ON predicate. This phase adds those rows to
the result table produced by the first two phases of the join, and uses NULL
marks as placeholders for the attributes from the nonpreserved side of the
join in those outer rows.

A composite join is simply a join based on a predicate that involves more
than one attribute from each side. A composite join is commonly required
when you need to join two tables based on a primary key–foreign key
relationship and the relationship is composite; that is, based on more than
one attribute.

When a join condition involves only an equality operator, the join is said to
be an equi join. When a join condition involves any operator besides
equality, the join is said to be a non-equi join.
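
The three fundamental join types described above can be compared on the same two tables. This is a minimal sketch that assumes SQLite via Python's sqlite3 module instead of SQL Server; the join syntax shown is standard SQL. The Customers and Orders tables, their rows, and the helper query are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (custid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Orders (orderid INTEGER PRIMARY KEY, custid INTEGER);
    INSERT INTO Customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO Orders VALUES (10, 1), (11, 1), (12, 2);
""")

def query(sql):
    """Run a query and return all of its rows."""
    return conn.execute(sql).fetchall()

# Cross join: Cartesian product only (3 customers x 3 orders).
cross = query(
    "SELECT C.custid, O.orderid FROM Customers AS C CROSS JOIN Orders AS O")

# Inner join: Cartesian product, then the ON filter (an equi join).
inner = query("""
    SELECT C.name, O.orderid
    FROM Customers AS C
    INNER JOIN Orders AS O ON C.custid = O.custid
    ORDER BY O.orderid
""")

# Left outer join: Customers is preserved, so Cy appears with a NULL order.
left_outer = query("""
    SELECT C.name, O.orderid
    FROM Customers AS C
    LEFT OUTER JOIN Orders AS O ON C.custid = O.custid
    ORDER BY C.custid, O.orderid
""")

print(len(cross))   # 9
print(inner)        # [('Ann', 10), ('Ann', 11), ('Bob', 12)]
print(left_outer)   # [('Ann', 10), ('Ann', 11), ('Bob', 12), ('Cy', None)]
```

The extra ('Cy', None) row is the outer row added by the third phase: Cy has no matching order, so the order attribute is filled with a NULL mark.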

Subqueries
SQL supports writing queries within queries, or nesting queries. The
outermost query is a query whose result set is returned to the caller and is
known as the outer query. The inner query is a query whose result is used
by the outer query and is known as a subquery.

A subquery can be either self-contained or correlated. A self-contained
subquery has no dependency on the outer query that it belongs to,
whereas a correlated subquery does. A subquery can be single-valued,
multivalued, or table-valued. That is, a subquery can return a single value (a
scalar value), multiple values, or a whole table result.

Every subquery has an outer query that it belongs to. Self-contained
subqueries are subqueries that are independent of the outer query that
they belong to.

A scalar subquery is a subquery that returns a single value, regardless of
whether it is self-contained. Such a subquery can appear anywhere in the
outer query where a single-valued expression can appear (such as WHERE
or SELECT).

A multivalued subquery is a subquery that returns multiple values as a
single column, regardless of whether the subquery is self-contained. Some
predicates, such as the IN predicate, operate on a multivalued subquery.

Correlated subqueries are subqueries that refer to attributes from the
table that appears in the outer query. This means that the subquery is
dependent on the outer query and cannot be invoked independently.
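
The subquery kinds described above can be sketched with three short queries. The example assumes SQLite via Python's sqlite3 module as a stand-in for SQL Server; the Customers and Orders tables, their rows, and the helper query are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (custid INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE Orders (orderid INTEGER PRIMARY KEY, custid INTEGER);
    INSERT INTO Customers VALUES (1, 'USA'), (2, 'UK'), (3, 'USA');
    INSERT INTO Orders VALUES (10, 1), (11, 2), (12, 3), (13, 1);
""")

def query(sql):
    """Run a query and return all of its rows."""
    return conn.execute(sql).fetchall()

# Self-contained scalar subquery: the customer who placed the latest order.
latest_customer = query("""
    SELECT custid FROM Orders
    WHERE orderid = (SELECT MAX(orderid) FROM Orders)
""")

# Self-contained multivalued subquery used with IN: orders by USA customers.
usa_orders = query("""
    SELECT orderid FROM Orders
    WHERE custid IN (SELECT custid FROM Customers WHERE country = 'USA')
    ORDER BY orderid
""")

# Correlated subquery: O1.custid comes from the outer query, so the inner
# query cannot run on its own. Returns the latest order of each customer.
latest_per_customer = query("""
    SELECT custid, orderid FROM Orders AS O1
    WHERE orderid = (SELECT MAX(O2.orderid) FROM Orders AS O2
                     WHERE O2.custid = O1.custid)
    ORDER BY custid
""")

print(latest_customer)      # [(1,)]
print(usa_orders)           # [(10,), (12,), (13,)]
print(latest_per_customer)  # [(1, 13), (2, 11), (3, 12)]
```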

Table Expressions
A table expression is a named query expression that represents a valid
relational table.

Microsoft SQL Server supports four types of table expressions: derived
tables, common table expressions (CTEs), views, and inline table-valued
functions (inline TVFs).

A query must meet three requirements to be valid to define a table
expression of any kind:
1. Order is not guaranteed.
2. All columns must have names.
3. All column names must be unique.

Derived tables (also known as table subqueries) are defined in the FROM
clause of an outer query. Their scope of existence is the outer query. As
soon as the outer query is finished, the derived table is gone.

Common table expressions (CTEs) are another standard form of table
expression very similar to derived tables, yet with a couple of important
advantages. A CTE is defined by using a WITH clause that names the
expression and precedes the outer query that refers to it.

A view is a reusable table expression whose definition is stored in the
database.

Inline TVFs are reusable table expressions that support input parameters.
In all respects except for the support for input parameters, inline TVFs are
similar to views.
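
Three of the four table expression types can be sketched with the same inner query, once as a derived table, once as a CTE, and once as a view. The example assumes SQLite via Python's sqlite3 module as a stand-in for SQL Server; SQLite has no inline table-valued functions, so those are not shown. The Orders table, the view name OrdersPerYear, and the helper query are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (orderid INTEGER PRIMARY KEY, custid INTEGER, orderyear INTEGER);
    INSERT INTO Orders VALUES (1, 1, 2021), (2, 2, 2021), (3, 1, 2022);

    -- View: the definition is stored in the database and can be reused.
    CREATE VIEW OrdersPerYear AS
        SELECT orderyear, COUNT(*) AS numorders
        FROM Orders
        GROUP BY orderyear;
""")

def query(sql):
    """Run a query and return all of its rows."""
    return conn.execute(sql).fetchall()

# Derived table: defined in the FROM clause and visible only to this query.
derived = query("""
    SELECT orderyear, numorders
    FROM (SELECT orderyear, COUNT(*) AS numorders
          FROM Orders
          GROUP BY orderyear) AS D
    ORDER BY orderyear
""")

# CTE: defined with WITH before the outer query that refers to it.
cte = query("""
    WITH C AS (
        SELECT orderyear, COUNT(*) AS numorders
        FROM Orders
        GROUP BY orderyear
    )
    SELECT orderyear, numorders FROM C ORDER BY orderyear
""")

view = query("SELECT orderyear, numorders FROM OrdersPerYear ORDER BY orderyear")

print(derived == cte == view, derived)  # True [(2021, 2), (2022, 1)]
```

In each case the inner query gives every column a unique name, and the ordering is applied only by the outer query, in line with the three requirements listed earlier.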
