
SQL Fundamentals

The document provides a comprehensive overview of SQL fundamentals, including syntax, data types, operators, and NULL handling. It covers database objects such as tables, keys, indexes, and views, as well as data manipulation and definition languages. Additionally, it discusses data control language, transaction management, isolation levels, and performance tuning techniques.

Uploaded by

mohdhanzala542

1. SQL Fundamentals:

o SQL syntax and basic commands (SELECT, INSERT, UPDATE, DELETE)

SELECT * FROM table_name;

INSERT INTO table_name (age, name) VALUES (24, 'khan'), (18, 'racy');

UPDATE table_name SET age = 18, name = 'john' WHERE name = 'khan';

DELETE FROM table_name WHERE age = 18;

o Data types (INT, VARCHAR, DATE, etc.)

SQL provides many data types, including:

1. INT

2. BIGINT

3. SMALLINT

4. TINYINT

5. DECIMAL or NUMERIC

6. FLOAT

7. DOUBLE

8. REAL

9. CHAR — fixed length

10. VARCHAR — variable length; generally used when you know the length can vary.

11. TEXT

12. DATE

13. TIME

14. DATETIME

15. TIMESTAMP

16. YEAR

17. BINARY

18. VARBINARY

19. BLOB

20. BOOLEAN or BOOL

21. ENUM

22. JSON

23. XML

24. GEOMETRY

25. POINT

26. LINESTRING

27. POLYGON

28. UUID

29. ARRAY

30. SET
o Operators (arithmetic, comparison, logical)

Arithmetic operators are + - * / %; comparison operators are = != <> < > <= >=; logical operators are AND, OR, NOT. (UNION and UNION ALL are set operators, not logical operators.)

o NULL values and handling

NULL is special in SQL: it doesn't mean 0 or an empty string; it indicates the absence of a value altogether.

NULL can't be treated as a normal value; it has its own set of rules.

We can't compare NULL with the normal comparison operators; instead we use IS: IS NULL checks whether a value is NULL, and IS NOT NULL checks that it is not.

With arithmetic operators, any expression involving NULL evaluates to NULL, for example 5 + NULL = NULL.

Aggregate functions ignore NULL values, except COUNT(*), which counts every row; COUNT(column) skips NULLs like the other aggregates.

With logical operators, NULL is neither TRUE nor FALSE, so the result is unknown: for example, NULL AND TRUE evaluates to NULL.

Handling the NULL value:-

Handling NULL values is essential while designing databases.

-> COALESCE() returns the first non-NULL argument, so we can replace a NULL with any default value we want, like COALESCE(col_1, 0).

-> NULLIF(a, b) returns NULL if a and b are equal, otherwise it returns a. It is commonly used to avoid division-by-zero errors.

Example:- 10 / NULLIF(col1, 0) — if col1 is 0 the divisor becomes NULL, so the result is NULL instead of an error.
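A quick sketch of these NULL rules, using Python's built-in sqlite3 (SQLite returns SQL NULL as Python None; the expressions are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Arithmetic with NULL yields NULL.
print(cur.execute("SELECT 5 + NULL").fetchone()[0])          # None

# NULL must be tested with IS NULL, not '=' (NULL = NULL is unknown).
print(cur.execute("SELECT NULL = NULL").fetchone()[0])       # None
print(cur.execute("SELECT NULL IS NULL").fetchone()[0])      # 1

# COALESCE substitutes a fallback for NULL.
print(cur.execute("SELECT COALESCE(NULL, 0)").fetchone()[0]) # 0

# NULLIF guards against division by zero: NULLIF(x, 0) is NULL when x = 0.
print(cur.execute("SELECT 10 / NULLIF(0, 0)").fetchone()[0]) # None, not an error
conn.close()
```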
2. Database Objects:

o Tables and their structure

Tables are the foundation of any database: data is stored in structured form in tables, and tables are connected to each other through relationships. A table consists of rows and columns. Rows are horizontal and are also called records or tuples; columns are vertical and are also known as attributes or fields.

o Primary keys and foreign keys (all other keys)

Primary Key is a column or set of columns used to uniquely identify each record of a table. Primary key values must be unique and cannot be NULL.

Foreign Key is a column or set of columns that references the primary key of another table and is used to link the two tables:

FOREIGN KEY (customer_id) REFERENCES customers(customer_id)

Unique Key is similar to a primary key, but it can contain a NULL value.

Composite Key is a combination of two or more columns used together as a primary key, foreign key, or unique key.

Alternate Key is a unique-value column that was not chosen as the primary key, although it has the potential to be one.

Surrogate Key is an artificial, auto-incrementing key that uniquely identifies a record. It is created by the designer and is not naturally present in the data.

Candidate Key is a minimal super key that can uniquely identify a record.

Super Key is a superset of a candidate key: any combination of columns that can identify a unique record.

o Indexes and their types (B-tree, hash, etc.)

SQL offers many types of indexes, such as clustered, non-clustered, bitmap, full-text, hash, and unique indexes.

But mainly there are 2 types of indexes:

Clustered:- The data is stored in sorted physical order; it is generally created on the primary key, the leaf nodes contain the actual data, and there can be only one per table. Lookups through it are faster than through non-clustered indexes, and it generally takes no extra storage, since it is the table itself.

Non-clustered:- Created on one or more columns, it holds pointers to the data and generally uses a B+ tree for storage. Multiple non-clustered indexes can exist per table, and each one is allocated its own storage.

o Views and materialized views

Views are created on top of tables to grant limited access and display limited data, helping ensure data security and privacy.

Materialized Views:- Unlike normal views, the result is physically stored on disk, which helps in retrieving frequently accessed data.

o Sequences and identity columns

Sequences generate auto-incrementing numbers with a defined interval.

Example:-

CREATE SEQUENCE employee_seq

START WITH 1

INCREMENT BY 1;

Identity columns:-

An identity column generates its value automatically when a row is inserted (e.g. employee_id INT IDENTITY(1,1) in SQL Server). A sequence can achieve the same effect by supplying the value explicitly (Oracle style):

Example:-

INSERT INTO employees (employee_id, name)

VALUES (employee_seq.NEXTVAL, 'John Doe');

o Schemas and databases

A schema is the blueprint of a table: it shows which attributes the table will contain, what the primary key will be, what the foreign keys will be, and other aspects. More broadly, a schema is a named container that groups related objects (tables, views, etc.) within a database.

3. Data Manipulation Language (DML):

o Inserting, updating, and deleting data


Inserting:-

INSERT INTO table_name (col_1, col_2) VALUES (1, 'pranjal'), (2, 'raju');

Updating:-

UPDATE table_name SET name = 'Pranjal' WHERE name = 'Raju';

Deleting:-

DELETE FROM table_name WHERE name = 'Raju';

o SELECT statement and its clauses (WHERE, GROUP BY, HAVING, ORDER BY)

The SELECT statement is used to retrieve data; we use different clauses to retrieve the desired data.

WHERE:- This clause filters rows by a condition. It cannot be used with aggregate functions (use HAVING for that).

GROUP BY:-

GROUP BY groups records on the basis of a column or a set of columns. It is generally used with aggregate functions.

HAVING:-

The HAVING clause filters on aggregated results; it is generally used with GROUP BY.

ORDER BY:-

ORDER BY arranges the records on the basis of a column or a group of columns.

FROM:-

Specifies which table the data is taken from.

LIMIT/OFFSET:-

LIMIT restricts how many rows are returned.

OFFSET skips a given number of rows before returning results.
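A sketch of how these clauses combine, using Python's sqlite3 (the sales table and its values are invented): WHERE filters rows before grouping, HAVING filters the groups after aggregation, ORDER BY sorts, LIMIT caps the output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE sales (region TEXT, amount INT)")
cur.executemany("INSERT INTO sales VALUES (?, ?)",
                [("north", 100), ("north", 200), ("south", 50), ("east", 250)])

rows = cur.execute("""
    SELECT region, SUM(amount) AS total
    FROM sales
    WHERE amount > 40          -- row filter (before grouping)
    GROUP BY region            -- one group per region
    HAVING SUM(amount) > 100   -- group filter (after aggregation)
    ORDER BY total DESC        -- sort the groups
    LIMIT 2                    -- keep at most two rows
""").fetchall()
print(rows)  # [('north', 300), ('east', 250)]
conn.close()
```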

o Joining tables (INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL OUTER JOIN, SELF JOIN)

INNER JOIN:- Joins two tables on a common key and retrieves only the records present in both tables.

LEFT JOIN:- Takes all the rows from the left table and the matching rows from the right table; wherever no match is found in the right table, the columns are filled with NULL.

RIGHT JOIN:- The mirror of a left join: it takes all the rows from the right table and the matching rows from the left table, filling NULL where no match is found in the left table.

FULL OUTER JOIN:- Takes all the rows from both tables; it is effectively the union of the left and right joins.

SELF JOIN:- Joining a table with itself is called a self join; it is generally used when records of the same table are related to each other.
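A minimal sketch of the INNER vs LEFT JOIN difference with sqlite3 (customers/orders tables and their rows are invented): an order with no matching customer disappears from the inner join but survives the left join with NULL (Python None) padding.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE customers (id INT, name TEXT)")
cur.execute("CREATE TABLE orders (id INT, customer_id INT)")
cur.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "khan"), (2, "racy")])
cur.executemany("INSERT INTO orders VALUES (?, ?)", [(10, 1), (11, 3)])  # customer 3 doesn't exist

inner = cur.execute("""
    SELECT o.id, c.name FROM orders o
    INNER JOIN customers c ON o.customer_id = c.id
""").fetchall()
print(inner)  # only the matching pair: [(10, 'khan')]

left = cur.execute("""
    SELECT o.id, c.name FROM orders o
    LEFT JOIN customers c ON o.customer_id = c.id
""").fetchall()
print(left)   # unmatched left row padded with NULL: (11, None)
conn.close()
```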

o Subqueries (correlated and non-correlated)

A subquery is a query nested inside another query: the subquery retrieves data, and the main query then operates on that result.

Correlated Subquery:- References columns from the outer query, so it is re-evaluated for each row of the main query.

Non-Correlated Subquery:- Independent of the outer query; it can be executed on its own.
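A sketch of a correlated subquery with sqlite3 (table and salaries invented): the inner query references e.dept from the outer row, so it is re-evaluated per employee to find those paid above their own department's average.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary INT)")
cur.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                [("a", "hr", 100), ("b", "hr", 300), ("c", "it", 500), ("d", "it", 500)])

rows = cur.execute("""
    SELECT name FROM employee e
    WHERE salary > (SELECT AVG(salary)              -- correlated: uses e.dept
                    FROM employee WHERE dept = e.dept)
""").fetchall()
print(rows)  # [('b',)] -- only 'b' is above its department's average
conn.close()
```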

o Aggregate functions (COUNT, SUM, AVG, MIN, MAX,FIRST,LAST)

COUNT() counts rows (or the non-NULL values of a column)

SUM() gives the sum of a column

AVG() gives the average of a column

MIN() gives the minimum value in a column

MAX() gives the maximum value in a column

FIRST() returns the first value of a column and LAST() the last; note these two are not standard SQL and are supported in only a few dialects (such as MS Access)

o Window functions (RANK, DENSE_RANK,


ROW_NUMBER,LAG(),LEAD(),FIRST_VALUE(),LAST_VALUE(),NTILE() etc.)

ROW_NUMBER() assigns a sequential number to each row: 1, 2, 3, 4, 5, …

EX:- SELECT name, department, salary, ROW_NUMBER() OVER (ORDER BY salary DESC) AS salary_num FROM employee;

DENSE_RANK() ranks rows without skipping any number, so ties produce sequences like 1, 1, 2, 2, 3, 4, 5:

SELECT name, department, salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS salary_rank FROM employee;

RANK() ranks rows but skips ranks after ties, like 1, 1, 3, 3, 5:

SELECT name, department, salary, RANK() OVER (ORDER BY salary DESC) AS salary_rank FROM employee;

LAG() returns the value from a previous row, offset by the given number of rows (1 by default):

SELECT name, department, salary, LAG(salary, 2) OVER (ORDER BY salary DESC) AS lagged_salary FROM employee;

LEAD() returns the value from a following row; for example, with day-by-day salary data, LEAD lets you see tomorrow's value on today's row.

FIRST_VALUE() gives the first value of each partition.

LAST_VALUE() gives the last value of each partition (within the current window frame, which by default ends at the current row).

NTILE() divides each partition into a specified number of groups.
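A side-by-side sketch of ROW_NUMBER vs RANK vs DENSE_RANK on tied salaries, using sqlite3 (window functions need SQLite >= 3.25, which ships with recent Python builds; names and salaries are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE employee (name TEXT, salary INT)")
cur.executemany("INSERT INTO employee VALUES (?, ?)",
                [("a", 900), ("b", 900), ("c", 500), ("d", 400)])

rows = cur.execute("""
    SELECT name,
           ROW_NUMBER() OVER (ORDER BY salary DESC) AS rn,   -- 1,2,3,4 (arbitrary within ties)
           RANK()       OVER (ORDER BY salary DESC) AS rnk,  -- 1,1,3,4 (skips after ties)
           DENSE_RANK() OVER (ORDER BY salary DESC) AS drnk  -- 1,1,2,3 (no skips)
    FROM employee
""").fetchall()
for r in rows:
    print(r)
conn.close()
```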

o Common Table Expressions (CTEs)

CTEs are named temporary result sets, defined with the WITH keyword, that exist for the duration of a single query; we can run further operations on top of them.

Example:-

WITH cte AS (SELECT * FROM table_name)
SELECT * FROM cte;

o Recursive queries

A recursive query references its own output — a query calling itself — until no new rows are produced. In SQL this is written as a recursive CTE.
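A minimal recursive CTE sketch with sqlite3: the anchor member seeds the result, then the recursive member keeps re-querying its own output until the WHERE condition stops it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

rows = cur.execute("""
    WITH RECURSIVE counter(n) AS (
        SELECT 1                       -- anchor member: the starting row
        UNION ALL
        SELECT n + 1 FROM counter      -- recursive member: builds on prior rows
        WHERE n < 5                    -- termination condition
    )
    SELECT n FROM counter
""").fetchall()
print([r[0] for r in rows])  # [1, 2, 3, 4, 5]
conn.close()
```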

4. Data Definition Language (DDL):

o Creating, altering, and dropping database objects

Creating:-

CREATE TABLE table_name (

Col_1 INT PRIMARY KEY,

Col_2 VARCHAR(100) UNIQUE,

Col_3 CHAR(10)

);

Altering:-

ALTER TABLE table_name

ADD column_name datatype constraint;

Dropping:-

DROP TABLE table_name

o Table constraints (PRIMARY KEY, FOREIGN KEY, UNIQUE, CHECK)

PRIMARY KEY, UNIQUE, and FOREIGN KEY have already been discussed.

CHECK:- validates a condition before an operation; the condition is enforced whenever data is inserted or updated.

o Default values and column constraints

The DEFAULT keyword sets a default value for a column: if no value is supplied for that column when inserting data, the default is used instead.

Column constraints have already been covered.

o Temporary tables and table variables

Temporary tables store data temporarily within a session; in SQL Server we use a "#" prefix for a session-local table and "##" for a globally accessible one.

CREATE TABLE #temp_employees (

employee_id INT,

name VARCHAR(100),
department VARCHAR(50)

);

Table variables:-

It is similar to a temporary table, but differs in where it is declared and scoped; table variables are generally declared inside stored procedures and user-defined functions.

DECLARE @temp_products TABLE (

product_id INT,

product_name VARCHAR(100),

price DECIMAL(10, 2)

);

5. Data Control Language (DCL):

o GRANT and REVOKE statements

The GRANT statement is used to give specific privileges to users or roles on database objects.

The REVOKE statement is used to remove previously granted privileges from users or roles.

-- GRANT

GRANT <privilege(s)> ON <object> TO <user/role> [WITH GRANT OPTION];

-- REVOKE

REVOKE <privilege(s)> ON <object> FROM <user/role>;

CREATE ROLE sales_rep;

CREATE ROLE manager;

GRANT SELECT, INSERT ON orders TO sales_rep;


GRANT SELECT, INSERT, UPDATE ON orders TO manager;

GRANT SELECT ON sensitive_data TO manager;

o User and role management

A USER is the account used to interact with the database and perform various actions. A user is created with CREATE USER, modified with ALTER USER, and deleted with DROP USER.

A ROLE is a collection of privileges that can be granted to users or to other roles. It is created with CREATE ROLE, altered with ALTER ROLE, and deleted with DROP ROLE.

o Privileges

Privileges are the actions that a user or role is allowed to perform on the database, for example SELECT, INSERT, UPDATE, and DELETE.

6. Transactions and Concurrency:

o ACID properties (Atomicity, Consistency, Isolation, Durability)

ACID stands for Atomicity, Consistency, Isolation, Durability; these properties are essential for any transactional database. Let's understand each one:

1. Atomicity:- A unit of work takes place as a whole: either every operation in the transaction succeeds, or the whole transaction fails. The operations happen together or not at all.

2. Consistency:- The database must remain in a consistent state after the transaction; every transaction moves the database from one valid state to another.

3. Isolation:- Every transaction executes as if in isolation; one transaction should not affect another running concurrently.

4. Durability:- Once a transaction is committed, its changes are permanent, even across failures.
o Transaction control statements (BEGIN, COMMIT, ROLLBACK, SAVEPOINT)

BEGIN:- Marks the beginning of a transaction.

COMMIT:- Ends a transaction, making its changes permanent.

ROLLBACK:- Cancels a transaction.

SAVEPOINT:- Creates a named marker within a transaction that can be rolled back to.

BEGIN TRANSACTION; -- Start the transaction

UPDATE employees SET salary = salary * 1.05 WHERE department = 'Sales';

DELETE FROM products WHERE discontinued = true;

SAVEPOINT before_insert; -- Create a savepoint

INSERT INTO products (product_name, price) VALUES ('New Product', 99.99);

-- Some error occurs...

ROLLBACK TRANSACTION before_insert; -- Rollback to the savepoint

COMMIT TRANSACTION; -- Commit the remaining changes
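The same savepoint pattern can be sketched with sqlite3 (SQLite uses SAVEPOINT / ROLLBACK TO rather than T-SQL's ROLLBACK TRANSACTION name; the products table is invented). Work done after the savepoint is undone, while the earlier insert survives the commit:

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # manual transaction control
cur = conn.cursor()
cur.execute("CREATE TABLE products (name TEXT, price REAL)")

cur.execute("BEGIN")
cur.execute("INSERT INTO products VALUES ('keyboard', 20.0)")
cur.execute("SAVEPOINT before_insert")
cur.execute("INSERT INTO products VALUES ('bad row', -1.0)")
cur.execute("ROLLBACK TO before_insert")   # undo only the work after the savepoint
cur.execute("COMMIT")                      # keep everything before it

print(cur.execute("SELECT name FROM products").fetchall())  # [('keyboard',)]
conn.close()
```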

o Isolation levels (READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ,


SERIALIZABLE)

Isolation Levels (Short Form)

READ UNCOMMITTED — can read uncommitted changes (dirty reads). Pros: highest performance. Cons: least data consistency.

READ COMMITTED — can only read committed data. Pros: good balance. Cons: non-repeatable and phantom reads possible.

REPEATABLE READ — reads a consistent snapshot, avoiding non-repeatable reads. Pros: stronger consistency. Cons: phantom reads still possible.

SERIALIZABLE — transactions run as if one after the other. Pros: highest consistency. Cons: lowest performance.

Choosing:

 Need speed, can tolerate inconsistencies? READ UNCOMMITTED

 Need some consistency, good performance? READ COMMITTED

 Need stronger consistency, OK with some constraints? REPEATABLE READ

 Absolute data integrity is critical? SERIALIZABLE

o Locks and lock types (shared, exclusive, update, intent)

Transactional databases use a locking system to govern how concurrent transactions interact with the same data simultaneously. Locks help avoid conflicts and ensure data integrity.

Different locks have different responsibilities: a shared lock allows reads only, an exclusive lock allows both read and write, an update lock is held while deciding whether to modify data (to prevent deadlocks during read-then-write), and intent locks are used in the lock hierarchy to signal locks held at a finer granularity.

o Deadlocks and deadlock detection

A deadlock occurs when two transactions each hold a lock the other one needs, so both wait on each other forever. Different mechanisms exist to detect deadlocks,
like the wait-for graph and timeouts.

7. Indexing and Performance Tuning:

o Index types and their use cases

Indexes help us retrieve data from the database quickly; we create indexes on top of tables. There are mainly 2 types of indexes.

o Clustered and non-clustered indexes

Clustered:- The index is the physical order of the data on disk; there is only one clustered index per table, rows are stored in that physical sequence, and the index gives direct access to the data.

Non-Clustered:- Built on top of one or more columns, it uses a B+ tree data structure to store the index entries; it is a bit slower than a clustered index because it contains pointers to the data rather than the data itself.

CREATE INDEX name_of_index ON Employees (LastName);

o Covering indexes

A covering index includes extra columns inside the index so that a query can be answered from the index alone, without touching the main table.

o Query optimization techniques

Query optimization can be done with below ways:-

1. Indexing on database

2. Rewriting the Queries

3. Statistical Analysis

4. Table Design and Management

5. Execution plan and profiling


6. Stored Procedures and temporary tables

Indexing: creating indexes on the database helps queries retrieve data fast; indexes take extra space, but retrieval becomes much quicker.

We sometimes have to rewrite queries, optimizing the joins and restricting data movement as much as possible. Efficient joins help: use an INNER join instead of a LEFT join where applicable, and prefer joins over subqueries where they perform better. Also, don't blindly SELECT *; select only the useful columns.

Statistical analysis means the DBMS collects statistics about how data is distributed in the database; based on these statistics the optimizer makes the best decisions about execution. Keeping statistics updated and giving hints can improve performance.

Table design at the start is important: choosing correct data types matters, and normalization helps reduce redundancy.

Execution plan and profiling: with EXPLAIN we can check the execution plan of the whole query and optimize the bottlenecks.

Stored procedures are precompiled SQL statements stored and executed in the database; they help in fast retrieval of data and reduce network traffic.

A temporary table is created on disk; it helps in fast retrieval of frequently used data.

o Execution plans and their interpretation

With EXPLAIN we can see the whole execution plan the DBMS has prepared — the joins, the costs, everything — and use it to optimize the query.

o Statistics and cardinality estimation

Cardinality estimation is the process of predicting the number of rows that will be returned at a particular step of the query plan; the query optimizer uses these estimates to pick the most efficient way to execute the query.

o Partitioning and sharding

Partitioning divides data on the basis of some key, creating chunks of data that enable parallel processing.

Sharding is horizontal scaling: data is distributed across multiple database servers, each of which is called a shard.
8. Stored Procedures and Functions:

o Creating and executing stored procedures

CREATE PROCEDURE procedure_name
AS
sql_statement
GO

CREATE PROCEDURE GetCustomerInfo @CustomerID INT
AS
SELECT * FROM Customers WHERE CustomerID = @CustomerID
GO

o Input and output parameters

Parameters passed into a stored procedure are input parameters; values the procedure passes back to the caller (declared with OUTPUT in T-SQL) are output parameters.

o User-defined functions (scalar, table-valued)

User-defined functions (UDFs) are custom functions created by the user; the built-in functions cover a lot, but with UDFs we can actually extend SQL's power.
There are two types of UDFs:-

Scalar: this function returns a single value of some data type.


CREATE FUNCTION CalculateAge (@BirthDate DATE)

RETURNS INT

AS
BEGIN

RETURN DATEDIFF(YEAR, @BirthDate, GETDATE());

END;

Table-valued functions return a table as their output; they are similar to views but can take parameters.

CREATE FUNCTION GetCustomersByRegion (@Region NVARCHAR(50))
RETURNS TABLE
AS
RETURN
(
SELECT * FROM Customers WHERE Region = @Region
);

o Error handling and exception management

Error handling is important in SQL; in T-SQL we can handle errors with a TRY...CATCH block:

BEGIN TRY

-- SQL code that might raise an error

END TRY

BEGIN CATCH

-- Error handling code (e.g., logging, custom error messages)

SELECT

ERROR_NUMBER() AS ErrorNumber,

ERROR_SEVERITY() AS ErrorSeverity,

ERROR_STATE() AS ErrorState,

ERROR_PROCEDURE() AS ErrorProcedure,

ERROR_LINE() AS ErrorLine,

ERROR_MESSAGE() AS ErrorMessage;
END CATCH;

9. Triggers and Events:

Triggers

 What they are: Special stored procedures that automatically execute in response to
specific events on a table or view (e.g., INSERT, UPDATE, DELETE).

 Why they're used:

o Enforcing complex business rules or data integrity constraints.

o Auditing data changes (logging who changed what and when).

o Maintaining relationships between tables (e.g., updating a summary table


when details change).

o Performing custom actions based on data modifications.

Types of Triggers

 DML Triggers:

o FOR (or AFTER): Executes after the triggering event (INSERT, UPDATE,
DELETE). You can access the affected rows through special tables like
INSERTED (new values) and DELETED (old values).

o INSTEAD OF: Executes in place of the triggering event. Useful for customizing
the behavior of views or tables with constraints.

 DDL Triggers: Triggered by data definition language events (e.g., CREATE, ALTER,
DROP). Often used for auditing database schema changes.

 Logon Triggers: Triggered when a user logs into the database.

Creating and Managing Triggers (Example in SQL Server)

CREATE TRIGGER trg_UpdateCustomerAudit
ON Customers
AFTER UPDATE
AS
BEGIN
INSERT INTO CustomerAudit (CustomerID, OldData, NewData, ModifiedBy, ModifiedDate)
SELECT d.CustomerID, d.Name + ' ' + d.Email, i.Name + ' ' + i.Email, SYSTEM_USER, GETDATE()
FROM DELETED d
JOIN INSERTED i ON d.CustomerID = i.CustomerID;
END;

 ALTER TRIGGER: Modify an existing trigger.

 DROP TRIGGER: Delete a trigger.

 sp_helptrigger: View information about triggers on a table.

Scheduled Events and Jobs (SQL Server Agent)

 SQL Server Agent: A component of SQL Server that lets you schedule and automate
administrative tasks.

 Jobs: A series of steps that can include SQL scripts, operating system commands, or
integration services packages.

 Schedules: Define when jobs should run (e.g., daily, weekly, on a specific date/time).

 Benefits:

o Automate backups, maintenance, data processing, report generation, etc.

o Run tasks during off-peak hours.

o Ensure critical processes happen consistently.

Creating a Job in SQL Server Agent

1. Open SQL Server Management Studio (SSMS).


2. Expand the "SQL Server Agent" node.

3. Right-click "Jobs" and choose "New Job."

4. Give the job a name and description.

5. Add steps (e.g., T-SQL code, SSIS packages).

6. Configure the schedule (e.g., daily at 3 AM).

Important Notes:

 Overuse: Triggers can introduce complexity and performance overhead if overused.


Use them judiciously.

 Recursion: Be careful to avoid infinite loops if triggers can modify data that then
triggers other triggers.

 Testing: Thoroughly test triggers before deploying them to production.

 Error Handling: Implement robust error handling within triggers to prevent issues.


10. Data Warehousing and Business Intelligence:

o Star and snowflake schemas

o Fact and dimension tables

o Slowly changing dimensions (SCD)

o ETL processes and tools

o OLAP and data cube operations


Star and Snowflake Schemas

 Star Schema: The foundational design for data warehouses. It consists of:

o Fact Table: Stores the core measurements or metrics of your business (e.g.,
sales amount, quantity sold).

o Dimension Tables: Store descriptive attributes related to the facts (e.g.,


product, time, customer).
o Relationships: The fact table has a foreign key relationship to each dimension
table.

 Snowflake Schema: An extension of the star schema where dimension tables are
normalized (broken down into smaller tables) to reduce redundancy. This can
improve data quality and save storage space, but it can also make queries more
complex.

Fact and Dimension Tables

 Fact Table: The heart of the data warehouse. Contains:

o Numerical Facts: The measurable values you want to analyze (e.g., sales,
costs, inventory).

o Foreign Keys: References to dimension tables that provide context for the
facts.

o Grain: The level of detail captured in the fact table (e.g., daily sales, individual
transactions).

 Dimension Table: Provides descriptive attributes for the facts. Contains:

o Primary Key: A unique identifier for each dimension member.

o Attributes: Textual or descriptive fields (e.g., product name, category,


customer name, address).

Slowly Changing Dimensions (SCD)

 The Problem: How to track changes in dimension attributes over time (e.g., a
customer's address or a product's price).

 SCD Types:

o Type 1: Overwrite the old value with the new one (no history).

o Type 2: Create a new row for the changed dimension member with a new
start date and a version number.

o Type 3: Keep the current value and a previous value in separate columns.

o Type 4: Use a history table to track all changes.

ETL Processes and Tools

 ETL: Extract, Transform, Load. The process of getting data from source systems,
cleaning and transforming it, and loading it into the data warehouse.

 ETL Tools: Help automate and streamline the ETL process (e.g., SQL Server
Integration Services (SSIS), Informatica PowerCenter, Talend Open Studio).
OLAP and Data Cube Operations

 OLAP: Online Analytical Processing. A technology for analyzing multidimensional


data.

 Data Cube: A conceptual model for representing multidimensional data.

 OLAP Operations:

o Slice: Select a specific subset of data based on dimension values (e.g., sales
for a particular product in a specific region).

o Dice: Select a subcube of data by applying multiple filters.

o Drill Down/Up: Explore data at different levels of granularity (e.g., from


yearly sales to monthly or weekly sales).

o Roll-up: Aggregate data along one or more dimensions (e.g., total sales across
all regions).

o Pivot: Rotate the data cube to view data from different perspectives.


11. Advanced SQL Techniques:

o Window functions and analytic queries

o Hierarchical queries and recursive CTEs

o Pivot and unpivot operations

o Dynamic SQL and SQL injection prevention

o Full-text search and indexing


1. Window Functions and Analytic Queries

 What they are: Functions that operate on a set of rows (a "window") relative to the
current row within a query result. They allow you to perform calculations across rows
without grouping them.

 Common Window Functions:

o ROW_NUMBER(), RANK(), DENSE_RANK(): Ranking rows.


o LEAD(), LAG(): Accessing values from preceding or following rows.

o SUM(), AVG(), COUNT(), MIN(), MAX() (with OVER clause): Aggregate


functions over windows.

 Benefits:

o Moving averages, cumulative sums, running totals.

o Ranking data within groups or partitions.

o Comparing values to other values within a window.

 Example (SQL Server):


SELECT

EmployeeID,

Salary,

AVG(Salary) OVER (PARTITION BY DepartmentID) AS AvgDeptSalary

FROM Employees;

2. Hierarchical Queries and Recursive CTEs

 What they are: Queries that deal with data organized in a hierarchical or tree-like
structure (e.g., employee hierarchies, product categories).

 Recursive CTEs: A type of Common Table Expression (CTE) that references itself to
iterate through the hierarchical data.

 Benefits:

o Traversing parent-child relationships.

o Finding all descendants or ancestors of a node.

o Calculating hierarchical totals.

 Example (SQL Server):


WITH EmployeeHierarchy AS
(
SELECT EmployeeID, ManagerID, Name
FROM Employees
WHERE ManagerID IS NULL -- Start with top-level employees

UNION ALL

SELECT e.EmployeeID, e.ManagerID, e.Name
FROM Employees e
INNER JOIN EmployeeHierarchy eh ON e.ManagerID = eh.EmployeeID
)
SELECT * FROM EmployeeHierarchy;

3. Pivot and Unpivot Operations

 Pivoting: Rotating rows into columns, turning unique values in a column into new
column headers.

 Unpivoting: Doing the opposite, rotating columns back into rows.

 Benefits:

o Reshaping data for better reporting or analysis.

o Making data more suitable for certain visualization tools.

 Example (SQL Server):


SELECT *
FROM Sales
PIVOT
(
SUM(Amount)
FOR Region IN ([North], [South], [East], [West])
) AS PivotTable;

4. Dynamic SQL and SQL Injection Prevention

 Dynamic SQL: Constructing SQL statements at runtime using variables or parameters.

 SQL Injection: A security vulnerability where malicious code is inserted into SQL
statements.

 Prevention:

o Parameterized Queries: Use parameters (@param) instead of directly


concatenating user input into SQL strings.

o Stored Procedures: Encapsulate dynamic SQL in stored procedures.

o Input Validation: Always validate and sanitize user input.
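The parameterized-query point can be sketched with sqlite3 (table, user, and attack string are invented): concatenating input lets an injected OR clause dump the table, while a ? placeholder binds the whole attack string as one literal value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE users (name TEXT, secret TEXT)")
cur.execute("INSERT INTO users VALUES ('khan', 's3cret')")

malicious = "nobody' OR '1'='1"

# Unsafe: the injected OR makes the WHERE clause always true.
unsafe = cur.execute(
    "SELECT secret FROM users WHERE name = '" + malicious + "'").fetchall()
print(unsafe)  # [('s3cret',)] -- data leaked

# Safe: the whole string is bound as a single literal value.
safe = cur.execute(
    "SELECT secret FROM users WHERE name = ?", (malicious,)).fetchall()
print(safe)    # [] -- no user is literally named that
conn.close()
```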

5. Full-Text Search and Indexing

 What it is: Specialized indexing and search engine capabilities within a database.

 Benefits:

o Fast search on large text fields (e.g., product descriptions, documents).

o Relevance ranking based on search terms.

o Linguistic features (stemming, synonyms).

 Available in: SQL Server (Full-Text Search), PostgreSQL (full-text search), MySQL
(FULLTEXT indexes).


12. SQL Security:

o Authentication and authorization

o Encryption and data masking

o SQL injection vulnerabilities and prevention

o Auditing and logging



1. Authentication and Authorization

 Authentication: Verifying the identity of a user or process attempting to access the


database.

o Methods:

 SQL Server Authentication: Users provide a username and password


managed by SQL Server.

 Windows Authentication: Users authenticate with their Windows


credentials.

 Mixed Mode: Supports both SQL Server and Windows authentication.

 Authorization: Determining what actions a user or process is allowed to perform on


database objects (e.g., tables, views, procedures).

o Permissions: Granted at the object level (e.g., SELECT, INSERT, UPDATE,


DELETE) or through roles (groups of permissions).

2. Encryption and Data Masking

 Encryption: Transforming data into an unreadable format to protect it from


unauthorized access.

o Types:

 Transparent Data Encryption (TDE): Encrypts the entire database at


rest (data and log files).

 Column-Level Encryption: Encrypts specific columns within a table.

 Encryption Functions: Encrypt and decrypt data within queries using


functions like ENCRYPTBYPASSPHRASE() (SQL Server) or
PGP_SYM_ENCRYPT() (PostgreSQL).

 Data Masking: Obfuscating sensitive data (e.g., credit card numbers, social security
numbers) by replacing it with realistic but fictitious values.

3. SQL Injection Vulnerabilities and Prevention

 SQL Injection: A malicious attack where attackers inject harmful SQL code into input
fields, potentially exposing or modifying sensitive data.

 Prevention:

o Parameterized Queries: Use parameters (@param) to separate data from SQL


code, preventing attackers from manipulating queries.
o Stored Procedures: Encapsulate SQL code in stored procedures to limit the
attack surface.

o Input Validation: Validate and sanitize all user input to ensure it conforms to
expected patterns.

o Least Privilege: Grant users and applications only the minimum permissions
they need.

4. Auditing and Logging

 Auditing: The process of tracking and recording database activity for security,
compliance, or troubleshooting purposes.

o SQL Server Audit: Built-in feature in SQL Server to track events like logins,
schema changes, and data access.

o Trigger-Based Auditing: Create triggers on tables to log changes in a separate


audit table.

o Third-Party Tools: Use specialized tools for more comprehensive auditing and
reporting.

 Logging: Recording information about errors, warnings, or other events in the


database engine or application.

General Security Best Practices

 Regularly Update and Patch: Keep your database software and operating system up
to date with the latest security patches.

 Strong Passwords: Enforce strong password policies for all database users.

 Limit Network Access: Restrict access to the database server from external
networks.

 Firewalls: Use firewalls to control incoming and outgoing traffic to the database
server.

 Regular Backups: Maintain regular backups of your data to protect against data loss
due to attacks or system failures.

 Security Training: Educate users and developers about security risks and best
practices.


Most asked queries:-


Find duplicate rows (rows sharing the same city_name and pin_code)

SELECT a1.id, a1.city_name, a1.pin_code

FROM addresses a1

INNER JOIN addresses a2 ON a1.city_name = a2.city_name AND a1.pin_code = a2.pin_code

WHERE a1.id < a2.id;
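A sketch of that self-join with sqlite3, alongside the more common GROUP BY ... HAVING alternative (the addresses data is invented): the self-join keeps the lower-id row of each duplicate pair, while GROUP BY lists each duplicated combination once with its count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE addresses (id INT, city_name TEXT, pin_code TEXT)")
cur.executemany("INSERT INTO addresses VALUES (?, ?, ?)",
                [(1, "delhi", "110001"), (2, "delhi", "110001"), (3, "pune", "411001")])

# Self-join: keeps the lower-id row of every duplicate pair.
dup = cur.execute("""
    SELECT a1.id, a1.city_name, a1.pin_code
    FROM addresses a1
    INNER JOIN addresses a2
        ON a1.city_name = a2.city_name AND a1.pin_code = a2.pin_code
    WHERE a1.id < a2.id
""").fetchall()
print(dup)  # [(1, 'delhi', '110001')]

# GROUP BY / HAVING: each duplicated combination, with its count.
grp = cur.execute("""
    SELECT city_name, pin_code, COUNT(*) AS cnt
    FROM addresses
    GROUP BY city_name, pin_code
    HAVING COUNT(*) > 1
""").fetchall()
print(grp)  # [('delhi', '110001', 2)]
conn.close()
```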
