Python SQliteModule

SQLite is a lightweight, serverless relational database management system that requires minimal setup and is self-contained. It supports dynamic data types, ACID-compliant transactions, and allows for easy integration into various applications without the need for configuration. Key features include the ability to create in-memory databases, use of a unique rowid for table entries, and support for various constraints such as NOT NULL, UNIQUE, and CHECK.

Uploaded by

semke812
Copyright
© © All Rights Reserved

Unit 1: Introduction to SQLite advantages, features and Fundamentals

What is SQLite?
SQLite is a software library that provides a relational database management
system. The "lite" in SQLite means lightweight in terms of setup, database
administration, and required resources. The library implements a
self-contained, serverless, zero-configuration, transactional SQL database
engine.
Serverless
Normally, an RDBMS such as MySQL, PostgreSQL, etc., requires a separate
server process to operate. The applications that want to access the database server
use TCP/IP protocol to send and receive requests. This is called client/server
architecture.

SQLite does NOT work this way.


SQLite does NOT require a server to run.
An SQLite database is integrated with the application that accesses it. The
application reads from and writes to the database files stored on disk
directly.

Self-Contained
Self-contained means that SQLite requires minimal support from the operating
system or external libraries. This makes SQLite usable in almost any environment,
especially embedded devices such as iPhones, Android phones, game consoles,
and handheld media players.
SQLite is developed using ANSI-C. The source code is available as a big
sqlite3.c and its header file sqlite3.h. If you want to develop an application that
uses SQLite, you just need to drop these files into your project and compile it
with your code.
Zero-configuration
Because of the serverless architecture, you don’t need to “install” SQLite before
using it. There is no server process that needs to be configured, started, and
stopped.
In addition, SQLite does not use any configuration files.
Transactional
All transactions in SQLite are fully ACID-compliant. It means all queries and
changes are Atomic, Consistent, Isolated, and Durable.
ACID is a set of principles that ensure database transactions are processed
reliably. When any data storage system upholds those principles, it is said to be
ACID compliant.
In other words, all changes within a transaction take place completely or not at
all even when an unexpected situation like application crash, power failure, or
operating system crash occurs.
SQLite distinctive features
SQLite uses dynamic types for tables. It means you can store any value in any
column, regardless of the data type.
SQLite allows a single database connection to access multiple database files
simultaneously. This brings many nice features like joining tables in different
databases or copying data between databases in a single command.
SQLite is capable of creating in-memory databases that are very fast to work with.
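As a minimal sketch of the last two features, the following Python snippet (using the standard sqlite3 module) opens an in-memory database, attaches a second in-memory database, and copies rows between them in one statement. The table names src and dst and the alias other are hypothetical, chosen only for illustration.

```python
import sqlite3

# ":memory:" creates a database that lives only as long as the connection.
conn = sqlite3.connect(":memory:")

# Attach a second (also in-memory) database under the hypothetical alias "other".
conn.execute("ATTACH DATABASE ':memory:' AS other")

conn.execute("CREATE TABLE main.src (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE other.dst (id INTEGER, name TEXT)")
conn.execute("INSERT INTO main.src VALUES (1, 'alpha'), (2, 'beta')")

# Copy rows from one database to the other in a single command.
conn.execute("INSERT INTO other.dst SELECT id, name FROM main.src")

copied = conn.execute("SELECT id, name FROM other.dst ORDER BY id").fetchall()
print(copied)
```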
Download SQLite tools
To download SQLite, you open the download page of the official SQLite website.
First, go to the https://www.sqlite.org website.
Second, open the download page https://www.sqlite.org/download.html
SQLite provides various tools for working across platforms e.g., Windows,
Linux, and Mac. You need to select an appropriate version to download.

Installing SQLite is simple and straightforward.


First, create a new folder e.g., C:\sqlite.
Second, extract the content of the file that you downloaded in the previous section
to the C:\sqlite folder.
SQLite Commands
To start the sqlite3 tool, you type sqlite3 at the command prompt.
By default, an SQLite session uses an in-memory database; therefore, all changes
will be gone when the session ends.
To open a database file, you use the .open FILENAME command.
Here FILENAME is the path of the database file.
If you start a session with a database name that does not exist, the sqlite3 tool will
create the database file.
Show all available commands and their purposes
To show all available commands and their purpose, you use the .help command
Show databases in the current database connection
To show all databases in the current connection, you use the .databases command.
The .databases command displays at least one database with the name: main.
To add an additional database in the current connection, you use the statement
ATTACH DATABASE.
ATTACH DATABASE PATH AS ALIAS;

Exit sqlite3 tool


To exit the sqlite3 program, you use the .exit command.

Introduction to SQLite data types


Database systems such as MySQL and PostgreSQL use static types. It means
when you declare a column with a specific data type, that column can store only
data of the declared data type.
Unlike these database systems, SQLite uses a dynamic type system. In
other words, a value stored in a column determines its data type, not the column’s
data type.
You don’t have to declare a specific data type for a column when you create a
table. Even if you declare a column with the integer data type, you can store any
kind of data in it, such as text and BLOB; SQLite will not complain about this.
SQLite provides five primitive data types which are referred to as storage classes.
Storage classes describe the formats that SQLite uses to store data on disk. A
storage class is more general than a data type.

Storage Class   Meaning
NULL            NULL values mean missing or unknown information.
INTEGER         Integer values are whole numbers (either positive or negative).
                An integer is stored in 0, 1, 2, 3, 4, 6, or 8 bytes depending
                on the magnitude of the value.
REAL            Real values are real numbers with decimal values that use
                8-byte floats.
TEXT            TEXT is used to store character data. There is no declared
                length limit; the size is bounded only by SQLite's maximum
                string length (1,000,000,000 bytes by default). SQLite
                supports the UTF-8 and UTF-16 character encodings.
BLOB            BLOB stands for a binary large object that can store any kind
                of data. Its size is bounded by the same maximum length as
                TEXT.
SQLite determines the storage class of a value based on its format, according to the
following rules:
If a literal has no enclosing quotes and no decimal point or exponent, SQLite assigns
the INTEGER storage class.
If a literal is enclosed by single or double quotes, SQLite assigns the TEXT
storage class.
If a literal has no enclosing quotes but has a decimal point or an exponent, SQLite
assigns the REAL storage class.
If a literal is NULL without quotes, SQLite assigns the NULL storage class.
If a literal has the form X'ABCD' or x'abcd', SQLite assigns the BLOB storage class.
SQLite provides the typeof() function that allows you to check the storage class
of a value based on its format.
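The rules above can be checked with typeof(). The sketch below runs one query from Python's standard sqlite3 module; the literal values are arbitrary examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One literal per storage class: integer, real, text, blob, and NULL.
storage_classes = conn.execute(
    "SELECT typeof(100), typeof(10.0), typeof('100'), typeof(x'1000'), typeof(NULL)"
).fetchone()
print(storage_classes)
```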
SQLite Affinity Type
SQLite supports the concept of type affinity on columns. Any column can still
store any type of data but the preferred storage class for a column is called its
affinity.
SQLite Affinity and Type Names

Data Type                                           Affinity
INT, INTEGER, TINYINT, SMALLINT, MEDIUMINT,
BIGINT, UNSIGNED BIG INT, INT2, INT8                INTEGER
CHARACTER(20), VARCHAR(255),
VARYING CHARACTER(255), NCHAR(55),
NATIVE CHARACTER(70), NVARCHAR(100), TEXT, CLOB     TEXT
BLOB, no datatype specified                         BLOB (formerly called NONE)
REAL, DOUBLE, DOUBLE PRECISION, FLOAT               REAL
NUMERIC, DECIMAL(10,5), BOOLEAN, DATE, DATETIME     NUMERIC
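
To see affinity in action, the following sketch stores mismatched values in an INTEGER column and a TEXT column and inspects the resulting storage classes. The table t and its columns are hypothetical names used only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER, s TEXT)")

# '123' looks like a number, so INTEGER affinity converts it to an integer;
# 456 is converted to text by the TEXT affinity of column s.
conn.execute("INSERT INTO t VALUES ('123', 456)")

# 'abc' cannot be converted, so it is stored as text; a BLOB is stored as is.
conn.execute("INSERT INTO t VALUES ('abc', x'00')")

stored = conn.execute("SELECT typeof(n), typeof(s) FROM t ORDER BY rowid").fetchall()
print(stored)
```

Affinity is only a preference: the second row shows that a value that cannot be converted is still accepted.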

Boolean Datatype
SQLite does not have a separate Boolean storage class. Instead, Boolean values
are stored as integers 0 (false) and 1 (true).
Date and Time Datatype
SQLite does not have a separate storage class for storing dates and/or times, but
SQLite is capable of storing dates and times as TEXT, REAL or INTEGER
values.
Sr.No.  Storage Class  Date Format
1       TEXT           A date in a format like "YYYY-MM-DD HH:MM:SS.SSS"
2       REAL           The number of days since noon in Greenwich on
                       November 24, 4714 B.C.
3       INTEGER        The number of seconds since 1970-01-01 00:00:00 UTC
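
A short sketch of the three representations, using SQLite's built-in date functions datetime(), julianday(), and strftime() from Python. The dates are arbitrary examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

as_text, as_real, as_seconds = conn.execute(
    """
    SELECT datetime(0, 'unixepoch'),              -- INTEGER seconds -> TEXT
           julianday('2000-01-01 12:00:00'),      -- TEXT -> REAL Julian day
           strftime('%s', '1970-01-02 00:00:00')  -- TEXT -> seconds since epoch
    """
).fetchone()
print(as_text, as_real, as_seconds)
```

Note that strftime() returns its result as text, so the seconds value arrives in Python as a string.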
SQLite CREATE TABLE statement
To create a new table in SQLite, you use CREATE TABLE statement using the
following syntax:
CREATE TABLE [IF NOT EXISTS] [schema_name].table_name (
column_1 data_type PRIMARY KEY,
column_2 data_type NOT NULL,
column_3 data_type DEFAULT 0,
table_constraints
) [WITHOUT ROWID];
✓ First, specify the name of the table that you want to create after the
CREATE TABLE keywords. The name of the table cannot start with
sqlite_ because it is reserved for the internal use of SQLite.

✓ Second, use IF NOT EXISTS option to create a new table if it does not
exist. Attempting to create a table that already exists without using the IF
NOT EXISTS option will result in an error.

✓ Third, optionally specify the schema_name to which the new table belongs.
The schema can be the main database, temp database or any attached
database.

✓ Fourth, specify the column list of the table. Each column has a name, data
type, and the column constraint. SQLite supports PRIMARY KEY,
UNIQUE, NOT NULL, and CHECK column constraints.

✓ Fifth, specify the table constraints such as PRIMARY KEY, FOREIGN
KEY, UNIQUE, and CHECK constraints.

✓ Finally, optionally use the WITHOUT ROWID option. By default, a row
in a table has an implicit column, which is referred to as the rowid, oid or
_rowid_ column. The rowid column stores a 64-bit signed integer key that
uniquely identifies the row inside the table. If you don’t want SQLite
to create the rowid column, you specify the WITHOUT ROWID option. A
table that contains the rowid column is known as a rowid table. Note that
the WITHOUT ROWID option is only available in SQLite 3.8.2 or later.
✓ Note that the primary key of a table is a column or a group of columns that
uniquely identify each row in the table.
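As an illustration of the syntax above, the sketch below creates a table from Python and shows the effect of IF NOT EXISTS. The contacts table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

ddl = """
CREATE TABLE IF NOT EXISTS contacts (
    contact_id INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    email      TEXT NOT NULL UNIQUE,
    visits     INTEGER DEFAULT 0
)
"""
conn.execute(ddl)
conn.execute(ddl)  # running it again is harmless because of IF NOT EXISTS

# Without IF NOT EXISTS, creating an existing table is an error.
try:
    conn.execute("CREATE TABLE contacts (x)")
    duplicate_error = None
except sqlite3.OperationalError as exc:
    duplicate_error = str(exc)
print(duplicate_error)
```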
Introduction to SQLite primary key
A primary key is a column or group of columns used to identify the uniqueness
of rows in a table. A table can have at most one primary key.
SQLite allows you to define primary key in two ways:
First, if the primary key has only one column, you use the PRIMARY KEY
column constraint to define the primary key as follows:
CREATE TABLE table_name(
column_1 INTEGER NOT NULL PRIMARY KEY,
...
);
Second, in case the primary key consists of two or more columns, you use the
PRIMARY KEY table constraint to define the primary key as shown in the
following statement.
CREATE TABLE table_name(
column_1 INTEGER NOT NULL,
column_2 INTEGER NOT NULL,
...
PRIMARY KEY(column_1,column_2,...)
);
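In standard SQL, the primary key column must not contain NULL values; the
primary key column has an implicit NOT NULL constraint. SQLite does not
always enforce this for rowid tables, which is why the statements above declare
NOT NULL explicitly.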
SQLite primary key and rowid table
When you create a table without specifying the WITHOUT ROWID option,
SQLite adds an implicit column called rowid that stores 64-bit signed integer.
The rowid column is a key that uniquely identifies the rows in the table. Tables
that have rowid columns are called rowid tables.
If a table has the primary key that consists of one column, and that column is
defined as INTEGER then this primary key column becomes an alias for the
rowid column.

Notice that if you assign another integer type such as BIGINT and
UNSIGNED INT to the primary key column, this column will not be an alias
for the rowid column.
Because the rowid table organizes its data as a B-tree, querying and sorting
data of a rowid table are very fast. It is faster than using a primary key which
is not an alias of the rowid.
Another important note is that if you declare a column with the INTEGER
type and PRIMARY KEY DESC clause, this column will not become an alias
for the rowid column:
CREATE TABLE table_name(
Column1 INTEGER PRIMARY KEY DESC,
...
);
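The difference between an INTEGER PRIMARY KEY and another integer type can be observed directly. In this sketch, tables a and b are hypothetical and differ only in the declared type of the id column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, v TEXT)")  # id aliases rowid
conn.execute("CREATE TABLE b (id BIGINT PRIMARY KEY, v TEXT)")   # id is a normal column

conn.execute("INSERT INTO a (id, v) VALUES (100, 'x')")
conn.execute("INSERT INTO b (id, v) VALUES (100, 'x')")

alias_row = conn.execute("SELECT rowid, id FROM a").fetchone()
plain_row = conn.execute("SELECT rowid, id FROM b").fetchone()
print(alias_row, plain_row)
```

In table a the rowid follows the id value, while in table b SQLite assigns its own rowid independently of id.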
SQLite foreign key constraint
SQLite has supported foreign key constraint since version 3.6.19.
To check whether your current version of SQLite supports foreign key
constraints or not, you use the following command.
PRAGMA foreign_keys;
The command returns an integer value: 1 means enabled, 0 means disabled. If the
command returns nothing, it means that your SQLite version doesn’t support foreign
key constraints.
If the SQLite library is compiled with foreign key constraint support, the
application can use the PRAGMA foreign_keys command to enable or disable
foreign key constraints at runtime.

Show tables in a database


To display all the tables in the current database, you use the .tables command.
If you want to find tables based on a specific pattern, you use the .tables pattern
command. The sqlite3 tool uses the LIKE operator for pattern matching. For example,
the following command shows the tables whose names end with s:
.tables '%s'

Show the structure of a table


To display the structure of a table, you use the .schema TABLE command.
The TABLE argument could be a pattern. If you omit it, the .schema command
will show the structures of all the tables.
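To define a foreign key constraint, you use the FOREIGN KEY table constraint in
the CREATE TABLE statement:
CREATE TABLE table_name(
column_1 INTEGER PRIMARY KEY NOT NULL,
FOREIGN KEY (foreign_key_columns)
REFERENCES parent_table(parent_key_columns)
);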

SQLite foreign key constraint actions


To specify how foreign key constraint behaves whenever the parent key is
deleted or updated, you use the ON DELETE or ON UPDATE action as
follows:
CREATE TABLE table_name(
column_1 INTEGER PRIMARY KEY NOT NULL,
column_2 INTEGER NOT NULL,
...
FOREIGN KEY (foreign_key_columns)
REFERENCES parent_table(parent_key_columns)
ON UPDATE action
ON DELETE action
);
SQLite supports the following actions:
✓ SET NULL
✓ SET DEFAULT
✓ RESTRICT
✓ NO ACTION
✓ CASCADE

SET NULL
When the parent key is deleted or updated, the corresponding child keys of
all rows in the child table are set to NULL.
SET DEFAULT
The SET DEFAULT action sets the value of the foreign key to the default value
specified in the column definition when you create the table.
RESTRICT
The RESTRICT action does not allow you to change or delete values in the parent
key of the parent table.
NO ACTION
The NO ACTION action does not mean that the foreign key constraint is bypassed.
It has an effect similar to RESTRICT.
CASCADE
The CASCADE action propagates the changes from the parent table to the child
table when you update or delete the parent key.
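A minimal sketch of two of these actions from Python follows. It assumes the SQLite library was built with foreign key support and enables enforcement with PRAGMA foreign_keys = ON; the parent and child tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Foreign key enforcement is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")
enabled = conn.execute("PRAGMA foreign_keys").fetchone()[0]

conn.executescript("""
CREATE TABLE parent (id INTEGER PRIMARY KEY);
CREATE TABLE child_cascade (
    id INTEGER PRIMARY KEY,
    parent_id INTEGER,
    FOREIGN KEY (parent_id) REFERENCES parent (id) ON DELETE CASCADE
);
CREATE TABLE child_set_null (
    id INTEGER PRIMARY KEY,
    parent_id INTEGER,
    FOREIGN KEY (parent_id) REFERENCES parent (id) ON DELETE SET NULL
);
INSERT INTO parent VALUES (1);
INSERT INTO child_cascade VALUES (10, 1);
INSERT INTO child_set_null VALUES (20, 1);
DELETE FROM parent WHERE id = 1;  -- triggers both actions
""")

cascade_rows = conn.execute("SELECT * FROM child_cascade").fetchall()
set_null_rows = conn.execute("SELECT * FROM child_set_null").fetchall()
print(enabled, cascade_rows, set_null_rows)
```

Deleting the parent row removes the CASCADE child entirely and leaves the SET NULL child with a NULL parent_id.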
Introduction to SQLite NOT NULL constraint
When you create a table, you can specify whether a column accepts NULL values
or not. By default, all columns in a table accept NULL values unless you
explicitly use NOT NULL constraints.
To define a NOT NULL constraint for a column, you use the following syntax:
CREATE TABLE table_name (
...,
column_name type_name NOT NULL,
...
);
Unlike other constraints such as PRIMARY KEY and CHECK, you can only
define NOT NULL constraints at the column level, not the table level.

Based on the SQL standard, PRIMARY KEY should always imply NOT NULL.
However, SQLite allows NULL values in a PRIMARY KEY column unless
the column is an INTEGER PRIMARY KEY column, the table is a WITHOUT
ROWID table, or the column is defined as a NOT NULL column.

Once a NOT NULL constraint is attached to a column, any attempt to set the
column value to NULL such as inserting or updating will cause a constraint
violation.
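In Python's sqlite3 module, such a violation surfaces as an sqlite3.IntegrityError. The sketch below uses a hypothetical people table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT NOT NULL)")

try:
    # Passing None binds a NULL, which the NOT NULL constraint rejects.
    conn.execute("INSERT INTO people (name) VALUES (?)", (None,))
    violation = None
except sqlite3.IntegrityError as exc:
    violation = str(exc)

row_count = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
print(violation, row_count)
```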
Introduction to SQLite UNIQUE constraint
A UNIQUE constraint ensures all values in a column or a group of columns are
distinct from one another or unique.

To define a UNIQUE constraint, you use the UNIQUE keyword followed by one
or more columns.

You can define a UNIQUE constraint at the column or the table level. Only at the
table level, you can define a UNIQUE constraint across multiple columns.
The following shows how to define a UNIQUE constraint for a column at the
column level:
CREATE TABLE table_name(
...,
column_name type UNIQUE,
...
);
The following illustrates how to define a UNIQUE constraint for multiple
columns:
CREATE TABLE table_name(
...,
UNIQUE(column_name1,column_name2,...)
);

SQLite UNIQUE constraint and NULL


SQLite treats all NULL values as different from one another; therefore, a column
with a UNIQUE constraint can have multiple NULL values.
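
The following sketch demonstrates this behavior with a hypothetical emails table: two NULL values are accepted, while a repeated non-NULL value is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emails (addr TEXT UNIQUE)")

conn.execute("INSERT INTO emails VALUES (NULL)")
conn.execute("INSERT INTO emails VALUES (NULL)")  # allowed: NULLs are distinct
conn.execute("INSERT INTO emails VALUES ('a@example.com')")

try:
    conn.execute("INSERT INTO emails VALUES ('a@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

null_count = conn.execute(
    "SELECT COUNT(*) FROM emails WHERE addr IS NULL"
).fetchone()[0]
print(duplicate_rejected, null_count)
```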

Introduction to SQLite CHECK constraints


SQLite CHECK constraints allow you to define expressions to test values
whenever they are inserted into or updated within a column.

If the values do not meet the criteria defined by the expression, SQLite will issue
a constraint violation and abort the statement.

The CHECK constraints allow you to define additional data integrity checks
beyond UNIQUE or NOT NULL to suit your specific application.

SQLite allows you to define a CHECK constraint at the column level or the table
level.
The following statement shows how to define a CHECK constraint at the column
level:
CREATE TABLE table_name(
...,
column_name data_type CHECK(expression),
...
);

and the following statement illustrates how to define a CHECK constraint at the
table level:
CREATE TABLE table_name(
...,
CHECK(expression)
);
In this syntax, whenever a row is inserted into a table or an existing row is
updated, the expression associated with each CHECK constraint is evaluated and
its result is converted to a numeric value.

If the result is zero, then a constraint violation occurred. If the result is a non-zero
value or NULL, it means no constraint violation occurred.
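
As a sketch, the snippet below defines a column-level CHECK on a hypothetical products table and tries a valid value, a NULL, and an invalid value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, price REAL CHECK (price > 0))")

conn.execute("INSERT INTO products VALUES ('ok', 9.5)")
# price > 0 evaluates to NULL for a NULL price, which is not a violation.
conn.execute("INSERT INTO products VALUES ('unknown', NULL)")

try:
    conn.execute("INSERT INTO products VALUES ('bad', -1)")
    check_rejected = False
except sqlite3.IntegrityError:
    check_rejected = True

product_count = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]
print(check_rejected, product_count)
```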

Introduction to SQLite ROWID table


Whenever you create a table without specifying the WITHOUT ROWID option,
you get an implicit auto-increment column called rowid. The rowid column stores a
64-bit signed integer that uniquely identifies a row in the table.

If you don’t specify the rowid value or you use a NULL value when you insert a
new row, SQLite automatically assigns the next sequential integer, which is one
larger than the largest rowid in the table. The rowid value starts at 1.

The maximum value of the rowid column is 9,223,372,036,854,775,807, which
is very big. If your data reaches this maximum value and you attempt to insert a
new row, SQLite will find an unused integer and use it. If SQLite cannot find
any unused integer, it will issue an SQLITE_FULL error. On top of that, if you
delete some rows and insert a new row, SQLite will try to reuse the rowid values
from the deleted rows.
SQLite AUTOINCREMENT column attribute
SQLite recommends that you should not use AUTOINCREMENT attribute
because:

The AUTOINCREMENT keyword imposes extra CPU, memory, disk space, and
disk I/O overhead and should be avoided if not strictly needed. It is usually not
needed.

In addition, the way SQLite assigns a value for the AUTOINCREMENT column is
slightly different from the way it does for the rowid column.
The main purpose of using the AUTOINCREMENT attribute is to prevent SQLite from
reusing a value that has already been used, such as a value from a previously
deleted row.

If you don’t have any requirement like this, you should not use the
AUTOINCREMENT attribute in the primary key.
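
The difference can be seen by deleting the row with the largest key and inserting a new one. In this sketch the tables plain and auto are hypothetical and differ only in the AUTOINCREMENT keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plain (id INTEGER PRIMARY KEY, v TEXT)")
conn.execute("CREATE TABLE auto (id INTEGER PRIMARY KEY AUTOINCREMENT, v TEXT)")

for table in ("plain", "auto"):
    conn.executemany(
        f"INSERT INTO {table} (v) VALUES (?)", [("a",), ("b",), ("c",)]
    )
    conn.execute(f"DELETE FROM {table} WHERE id = 3")   # remove the largest id
    conn.execute(f"INSERT INTO {table} (v) VALUES ('d')")

plain_ids = [row[0] for row in conn.execute("SELECT id FROM plain ORDER BY id")]
auto_ids = [row[0] for row in conn.execute("SELECT id FROM auto ORDER BY id")]
print(plain_ids, auto_ids)
```

The plain table hands out 3 again, whereas the AUTOINCREMENT table moves on to 4.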

SQLite INSERT – inserting a single row into a table


To insert a single row into a table, you use the following form of the INSERT
statement:

INSERT INTO table (column1, column2, ...) VALUES (value1, value2, ...);

✓ First, specify the name of the table to which you want to insert data after
the INSERT INTO keywords.
✓ Second, add a comma-separated list of columns after the table name. The
column list is optional. However, it is a good practice to include the column
list after the table name.
✓ Third, add a comma-separated list of values after the VALUES keyword.
If you omit the column list, you have to specify values for all columns in
the value list. The number of values in the value list must be the same as
the number of columns in the column list.
SQLite INSERT – Inserting multiple rows into a table

INSERT INTO table1 (column1, column2, ...)
VALUES
(value1, value2, ...),
(value1, value2, ...),
...
(value1, value2, ...);
SQLite INSERT – Inserting default values
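When you create a new table using the CREATE TABLE statement, you can
specify default values for columns. A column without a specified default value
defaults to NULL. To insert a row that consists only of default values, you use
the following statement:

INSERT INTO table DEFAULT VALUES;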

SQLite INSERT – Inserting new rows with data provided by a SELECT statement
INSERT INTO backup_table
SELECT column1,column2
FROM table;
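
The four forms of INSERT can be tried together from Python. This is a sketch only; the artists and artists_backup tables and the sample names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE artists (id INTEGER PRIMARY KEY, name TEXT DEFAULT 'unknown')"
)
conn.execute("CREATE TABLE artists_backup (id INTEGER, name TEXT)")

# Single row, with the value passed as a bound parameter.
conn.execute("INSERT INTO artists (name) VALUES (?)", ("Bud Powell",))

# Multiple rows in one statement.
conn.execute("INSERT INTO artists (name) VALUES ('Buddy Rich'), ('Candido')")

# A row made only of default values.
conn.execute("INSERT INTO artists DEFAULT VALUES")

# Rows provided by a SELECT statement.
conn.execute("INSERT INTO artists_backup SELECT id, name FROM artists")

backup = conn.execute("SELECT id, name FROM artists_backup ORDER BY id").fetchall()
print(backup)
```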

SQLite Update
To update existing data in a table, you use SQLite UPDATE statement. The
following illustrates the syntax of the UPDATE statement:
UPDATE table
SET column_1 = new_value_1,
column_2 = new_value_2
WHERE
search_condition
ORDER BY column_or_expression
LIMIT row_count OFFSET offset;

✓ First, specify the table that you want to update after the UPDATE keyword.
✓ Second, set a new value for each column of the table in the SET clause.
✓ Third, specify rows to update using a condition in the WHERE clause. The
WHERE clause is optional. If you skip it, the UPDATE statement will
update data in all rows of the table.
✓ Finally, use the ORDER BY and LIMIT clauses in the UPDATE statement
to specify the number of rows to update. These two clauses are available in
UPDATE only when SQLite is compiled with the
SQLITE_ENABLE_UPDATE_DELETE_LIMIT option.
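
A minimal sketch of an UPDATE with a WHERE clause, run from Python. The employees table is hypothetical, and the ORDER BY and LIMIT clauses are left out because they depend on how the SQLite library was built.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO employees (name, city) VALUES (?, ?)",
    [("Ann", "Oslo"), ("Bob", "Oslo"), ("Cy", "Rome")],
)

# Only rows matching the WHERE condition are changed.
cursor = conn.execute(
    "UPDATE employees SET city = ? WHERE city = ?", ("Bergen", "Oslo")
)
updated = cursor.rowcount  # number of rows modified by the statement

cities = conn.execute("SELECT name, city FROM employees ORDER BY id").fetchall()
print(updated, cities)
```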

SQLite Delete
The SQLite DELETE statement allows you to delete one row, multiple rows, and
all rows in a table. The syntax of the SQLite DELETE statement is as follows:

DELETE FROM table
WHERE search_condition;
✓ First, specify the name of the table which you want to remove rows after
the DELETE FROM keywords.
✓ Second, add a search condition in the WHERE clause to identify the rows
to remove. The WHERE clause is an optional part of the DELETE
statement. If you omit the WHERE clause, the DELETE statement will
delete all rows in the table.
SQLite Select
The SELECT statement is one of the most commonly used statements in SQL.
The SQLite SELECT statement provides all features of the SELECT statement
in SQL standard.

The syntax of the SELECT statement is as follows:


SELECT DISTINCT column_list
FROM table_list
JOIN table ON join_condition
WHERE row_filter
GROUP BY column
HAVING group_filter
ORDER BY column
LIMIT count OFFSET offset;

The simplest form of the SELECT statement allows you to query data from a
single table.

SELECT column_list FROM table;

Even though the SELECT clause appears before the FROM clause, SQLite
evaluates the FROM clause first and then the SELECT clause, therefore:

First, specify the table where you want to get data from in the FROM clause.
Notice that you can have more than one table in the FROM clause.

Second, specify a column or a list of comma-separated columns in the SELECT
clause.

To select all columns of a table, you use the asterisk (*):
SELECT * FROM table;

Introduction to SQLite ORDER BY clause

SQLite stores data in the tables in an unspecified order. It means that the rows in
the table may or may not be in the order that they were inserted.

If you use the SELECT statement to query data from a table, the order of rows
in the result set is unspecified.

To sort the result set, you add the ORDER BY clause to the SELECT statement
as follows:

SELECT
select_list
FROM
table
ORDER BY
column_1 ASC,
column_2 DESC;

The ORDER BY clause comes after the FROM clause. It allows you to sort the
result set based on one or more columns in ascending or descending order.

In this syntax, you place the column name by which you want to sort after the
ORDER BY clause followed by the ASC or DESC keyword.

✓ The ASC keyword means ascending.
✓ And the DESC keyword means descending.
If you don’t specify the ASC or DESC keyword, SQLite sorts the result set using
the ASC option. In other words, it sorts the result set in the ascending order by
default.

In case you want to sort the result set by multiple columns, you use a comma (,)
to separate two columns. The ORDER BY clause sorts rows using columns or
expressions from left to right.
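
As a sketch of sorting by multiple columns, the snippet below orders a hypothetical tracks table by album ascending and, within each album, by duration descending.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (name TEXT, album TEXT, ms INTEGER)")
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?, ?)",
    [("t1", "A", 200), ("t2", "B", 100), ("t3", "A", 300), ("t4", "B", 400)],
)

# Rows are sorted by album first; ms only breaks ties inside one album.
ordered = [
    row[0]
    for row in conn.execute("SELECT name FROM tracks ORDER BY album ASC, ms DESC")
]
print(ordered)
```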

SQLite Select Distinct


The DISTINCT clause is an optional clause of the SELECT statement. The
DISTINCT clause allows you to remove the duplicate rows in the result set.
The following statement illustrates the syntax of the DISTINCT clause:

SELECT DISTINCT select_list FROM table;

✓ The DISTINCT clause must appear immediately after the SELECT
keyword.
✓ You place a column or a list of columns after the DISTINCT keyword. If
you use one column, SQLite uses values in that column to evaluate the
duplicate. In case you use multiple columns, SQLite uses the combination
of values in these columns to evaluate the duplicate.

SQLite considers NULL values as duplicates. If you use the DISTINCT clause
with a column that has NULL values, SQLite will keep one row of a NULL value.
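
The sketch below applies DISTINCT to a hypothetical cities table that contains duplicate rows and duplicate NULL rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (city TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO cities VALUES (?, ?)",
    [("Paris", "FR"), ("Paris", "FR"), ("Lyon", "FR"), (None, None), (None, None)],
)

# The combination of both columns decides what counts as a duplicate.
distinct_rows = conn.execute("SELECT DISTINCT city, country FROM cities").fetchall()

# With one column, only that column's values are compared.
distinct_countries = conn.execute("SELECT DISTINCT country FROM cities").fetchall()
print(distinct_rows, distinct_countries)
```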

SQLite Where
The WHERE clause is an optional clause of the SELECT statement. It appears
after the FROM clause as the following statement:

SELECT
column_list
FROM
table
WHERE
search_condition;

When evaluating a SELECT statement with a WHERE clause, SQLite uses the
following steps:

✓ First, check the table in the FROM clause.
✓ Second, evaluate the conditions in the WHERE clause to get the rows that
meet these conditions.
✓ Third, make the final result set based on the rows in the previous step with
columns in the SELECT clause.

The search condition in the WHERE has the following form:

left_expression COMPARISON_OPERATOR right_expression


e.g.
WHERE column_1 = 100;
WHERE column_2 IN (1,2,3);
WHERE column_3 LIKE 'An%';
WHERE column_4 BETWEEN 10 AND 20;

SQLite comparison operators


A comparison operator compares two expressions, for example to test whether they
are equal.
Operator Meaning
= Equal to
<> or != Not equal to
< Less than
> Greater than
<= Less than or equal to
>= Greater than or equal to

SQLite logical operators


Logical operators allow you to test the truth of some expressions. A logical
operator returns 1, 0, or a NULL value.

Notice that SQLite does not provide Boolean data type therefore 1 means TRUE,
and 0 means FALSE.
Operator   Meaning
ALL        returns 1 if all expressions are 1.
AND        returns 1 if both expressions are 1, and 0 if one of the
           expressions is 0.
ANY        returns 1 if any one of a set of comparisons is 1.
BETWEEN    returns 1 if a value is within a range.
EXISTS     returns 1 if a subquery contains any rows.
IN         returns 1 if a value is in a list of values.
LIKE       returns 1 if a value matches a pattern.
NOT        reverses the value of other operators such as NOT EXISTS, NOT
           IN, NOT BETWEEN, etc.

Note that ALL and ANY are quantified comparison operators of standard SQL that
SQLite itself does not implement; a subquery with IN or EXISTS is normally used
instead.

SQLite Limit
The LIMIT clause is an optional part of the SELECT statement. You use the
LIMIT clause to constrain the number of rows returned by the query.

For example, a SELECT statement may return one million rows. However, if you
just need the first 10 rows in the result set, you can add the LIMIT clause to the
SELECT statement to retrieve 10 rows.
The following illustrates the syntax of the LIMIT clause.
SELECT
column_list
FROM
table
LIMIT row_count;

The row_count is a positive integer that specifies the number of rows returned.

If you want to skip some rows before returning row_count rows, you use the
OFFSET keyword. For example, LIMIT 10 OFFSET 10 returns 10 rows starting from
the 11th row of the result set. The syntax is as follows:

SELECT
column_list
FROM
table
LIMIT row_count OFFSET offset;

Or you can use the following shorthand syntax of the LIMIT OFFSET clause:
SELECT
column_list
FROM
table
LIMIT offset, row_count;
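
The three forms can be compared on a small table. In this sketch the table numbers is hypothetical and holds the integers 1 to 20; an ORDER BY clause is added so that the rows come back in a defined order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE numbers (x INTEGER)")
conn.executemany("INSERT INTO numbers VALUES (?)", [(i,) for i in range(1, 21)])

def column(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

first_five = column("SELECT x FROM numbers ORDER BY x LIMIT 5")
with_offset = column("SELECT x FROM numbers ORDER BY x LIMIT 5 OFFSET 10")
shorthand = column("SELECT x FROM numbers ORDER BY x LIMIT 10, 5")
print(first_five, with_offset, shorthand)
```

In the shorthand form the offset comes first, so LIMIT 10, 5 is the same as LIMIT 5 OFFSET 10.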

Introduction to SQLite HAVING clause


SQLite HAVING clause is an optional clause of the SELECT statement. The
HAVING clause specifies a search condition for a group.

You often use the HAVING clause with the GROUP BY clause. The GROUP
BY clause groups a set of rows into a set of summary rows or groups. Then the
HAVING clause filters groups based on a specified condition.

If you use the HAVING clause, you should also include the GROUP BY clause;
older versions of SQLite report an error when HAVING is used without GROUP BY.

The following illustrates the syntax of the HAVING clause:

SELECT
column_1,
column_2,
aggregate_function (column_3)
FROM
table
GROUP BY
column_1,
column_2
HAVING
search_condition;
In this syntax, the HAVING clause evaluates the search_condition for each group
as a Boolean expression. It only includes a group in the final result set if the
evaluation is true.
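
As a sketch, the query below groups a hypothetical tracks table by album and keeps only the albums that have at least two tracks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (album TEXT, ms INTEGER)")
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?)",
    [("A", 100), ("A", 200), ("B", 50), ("C", 300), ("C", 300), ("C", 300)],
)

# GROUP BY builds one summary row per album; HAVING then filters the groups.
big_albums = conn.execute(
    """
    SELECT album, COUNT(*)
    FROM tracks
    GROUP BY album
    HAVING COUNT(*) >= 2
    ORDER BY album
    """
).fetchall()
print(big_albums)
```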

Introduction to SQLite UNION operator

Sometimes, you need to combine data from multiple tables into a complete result
set. It may be for tables with similar data within the same database or maybe you
need to combine similar data from multiple databases.

To combine rows from two or more queries into a single result set, you use SQLite
UNION operator. The following illustrates the basic syntax of the UNION
operator:

query_1
UNION [ALL]
query_2
UNION [ALL]
query_3
...;
Both UNION and UNION ALL operators combine rows from result sets into a
single result set. The UNION operator removes duplicate rows, whereas
the UNION ALL operator does not.

Because the UNION ALL operator does not remove duplicate rows, it runs faster
than the UNION operator.

The following are rules to union data:

✓ The number of columns in all queries must be the same.
✓ The corresponding columns must have compatible data types.
✓ The column names of the first query determine the column names of the
combined result set.
✓ The GROUP BY and HAVING clauses are applied to each individual
query, not the final result set.
✓ The ORDER BY clause is applied to the combined result set, not within
the individual result set.
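
The difference between the two operators shows up as soon as the inputs overlap. The sketch below uses two hypothetical single-column tables t1 and t2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (v INTEGER)")
conn.execute("CREATE TABLE t2 (v INTEGER)")
conn.executemany("INSERT INTO t1 VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO t2 VALUES (?)", [(2,), (3,), (4,)])

# ORDER BY applies to the combined result set.
union_rows = [
    r[0] for r in conn.execute("SELECT v FROM t1 UNION SELECT v FROM t2 ORDER BY v")
]
union_all_rows = [
    r[0]
    for r in conn.execute("SELECT v FROM t1 UNION ALL SELECT v FROM t2 ORDER BY v")
]
print(union_rows, union_all_rows)
```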

Introduction to SQLite EXCEPT operator


SQLite EXCEPT operator compares the result sets of two queries and returns
distinct rows from the left query that are not output by the right query.

The following shows the syntax of the EXCEPT operator:


SELECT select_list1
FROM table1
EXCEPT
SELECT select_list2
FROM table2

This query must conform to the following rules:


✓ First, the number of columns in the select lists of both queries must be the
same.
✓ Second, the order of the columns and their types must be comparable.

Introduction to SQLite INTERSECT operator


SQLite INTERSECT operator compares the result sets of two queries and returns
distinct rows that are output by both queries.

The following illustrates the syntax of the INTERSECT operator:


SELECT select_list1
FROM table1
INTERSECT
SELECT select_list2
FROM table2

The basic rules for combining the result sets of two queries are as follows:
✓ First, the number and the order of the columns in all queries must be the
same.
✓ Second, the data types must be comparable.
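
A short sketch of EXCEPT and INTERSECT side by side, using two hypothetical single-column tables t1 and t2 with overlapping values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (v INTEGER)")
conn.execute("CREATE TABLE t2 (v INTEGER)")
conn.executemany("INSERT INTO t1 VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO t2 VALUES (?)", [(2,), (3,), (4,)])

# Rows of the left query that the right query does not return.
only_left = [
    r[0] for r in conn.execute("SELECT v FROM t1 EXCEPT SELECT v FROM t2 ORDER BY v")
]
# Rows that both queries return.
in_both = [
    r[0]
    for r in conn.execute("SELECT v FROM t1 INTERSECT SELECT v FROM t2 ORDER BY v")
]
print(only_left, in_both)
```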
Introduction to SQLite subquery
A subquery is a SELECT statement nested in another statement. See the following
statement.
SELECT column_1
FROM table_1
WHERE column_1 = (
SELECT column_1
FROM table_2
);

You must use a pair of parentheses to enclose a subquery. Note that you can nest
a subquery inside another subquery with a certain depth.

Typically, a subquery returns a single row as an atomic value, though it may
return multiple rows for comparing values with the IN operator.

You can use a subquery in the SELECT, FROM, WHERE, and JOIN clauses.

Introduction to SQLite EXISTS operator


The EXISTS operator is a logical operator that checks whether a subquery returns
any row.
Here is the basic syntax of the EXISTS operator:
EXISTS(subquery)

In this syntax, the subquery is a SELECT statement that returns zero or more
rows.

If the subquery returns one or more rows, the EXISTS operator returns true (1).
Otherwise, the EXISTS operator returns false (0).

Note that if the subquery returns one row with NULL, the result of the EXISTS
operator is still true because the result set contains one row with NULL.

To negate the EXISTS operator, you use the NOT EXISTS operator as follows:
NOT EXISTS (subquery)

The NOT EXISTS operator returns true if the subquery returns no row.
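
The sketch below uses EXISTS and NOT EXISTS with a correlated subquery. The customers and orders tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.execute("INSERT INTO orders VALUES (1, 1)")  # only Ann has an order

query = """
SELECT name FROM customers AS c
WHERE {op} (SELECT 1 FROM orders AS o WHERE o.customer_id = c.id)
ORDER BY name
"""
with_orders = [r[0] for r in conn.execute(query.format(op="EXISTS"))]
without_orders = [r[0] for r in conn.execute(query.format(op="NOT EXISTS"))]

# A subquery that returns a single NULL row still counts as "has a row".
null_row_exists = conn.execute("SELECT EXISTS (SELECT NULL)").fetchone()[0]
print(with_orders, without_orders, null_row_exists)
```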
SQLite Join
In SQLite, the JOIN clause is used to combine records from two or more tables in a
database. It combines fields from two tables by using values common to both
tables.

SQLite INNER JOIN (Simple Join)


SQLite INNER JOINS return all rows from multiple tables where the join
condition is met
Syntax
SELECT columns FROM table1 INNER JOIN table2
ON table1.column = table2.column;

LEFT JOIN
This type of join returns all rows from the left-hand table (the table named
before the LEFT JOIN keywords) and only those rows from the other table where
the joined fields are equal (the join condition is met). Where no matching row
exists, the columns of the right-hand table are filled with NULL.
Syntax
SELECT columns FROM table1 LEFT [OUTER] JOIN table2
ON table1.column = table2.column;
The OUTER keyword is optional: LEFT OUTER JOIN and LEFT JOIN are equivalent.

SQLite CROSS JOIN


This type of join returns a combined result set with every row from the first table
matched with every row from the second table. This is also called a Cartesian
Product.
Syntax
SELECT columns FROM table1 CROSS JOIN table2;
Unlike an INNER or OUTER join, a CROSS JOIN has no condition to join the 2
tables.
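The three join types can be compared side by side. The sketch below assumes two small hypothetical tables, dept and emp, where one employee has no department.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER);
    INSERT INTO dept VALUES (1, 'Sales'), (2, 'HR');
    INSERT INTO emp VALUES (1, 'Asha', 1), (2, 'Ravi', NULL);
""")

# INNER JOIN: only employees whose dept_id matches a department.
inner = con.execute(
    "SELECT emp.name, dept.name FROM emp "
    "INNER JOIN dept ON emp.dept_id = dept.id"
).fetchall()

# LEFT JOIN: every employee; unmatched department columns become NULL (None).
left = con.execute(
    "SELECT emp.name, dept.name FROM emp "
    "LEFT JOIN dept ON emp.dept_id = dept.id ORDER BY emp.id"
).fetchall()

# CROSS JOIN: every employee paired with every department (2 x 2 rows).
cross = con.execute("SELECT emp.name, dept.name FROM emp CROSS JOIN dept").fetchall()
con.close()
```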

SQLite CASE
The SQLite CASE expression evaluates a list of conditions and returns an
expression based on the result of the evaluation.
The CASE expression is similar to the IF-THEN-ELSE statement in other
programming languages.

You can use the CASE expression in any clause or statement that accepts a valid
expression. For example, you can use the CASE expression in clauses such as
WHERE, ORDER BY, HAVING, SELECT and statements such as SELECT,
UPDATE, and DELETE.

SQLite simple CASE expression Syntax


CASE case_expression
WHEN when_expression_1 THEN result_1
WHEN when_expression_2 THEN result_2
...
[ ELSE result_else ]
END
The simple CASE expression compares the case_expression to the expression that
appears in the first WHEN clause, when_expression_1, for equality.
If the case_expression equals when_expression_1, the simple CASE returns the
expression in the corresponding THEN clause, which is result_1.

Otherwise, the simple CASE expression compares the case_expression with the
expression in the next WHEN clause.

If the case_expression matches none of the WHEN expressions, the CASE expression
returns the result_else in the ELSE clause. If you omit the ELSE clause, the CASE
expression returns NULL.
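As an illustration of the simple CASE form, the following sketch maps a grade letter to a remark. The students table and the remark texts are invented for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE students (name TEXT, grade TEXT);
    INSERT INTO students VALUES ('Asha', 'A'), ('Ravi', 'B'), ('Mina', 'X');
""")

# grade is compared with each WHEN value in turn; ELSE covers everything else.
rows = con.execute("""
    SELECT name,
           CASE grade
               WHEN 'A' THEN 'Excellent'
               WHEN 'B' THEN 'Good'
               ELSE 'Unknown'
           END AS remark
    FROM students
    ORDER BY name
""").fetchall()
con.close()
```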

Introduction to the SQLite GLOB operator


The GLOB operator is similar to the LIKE operator. The GLOB operator
determines whether a string matches a specific pattern.

Unlike the LIKE operator, the GLOB operator is case sensitive and uses the
UNIX wildcards. In addition, the GLOB patterns do not have escape characters.

The following shows the wildcards used with the GLOB operator:
✓ The asterisk (*) wildcard matches any number of characters.
✓ The question mark (?) wildcard matches exactly one character.

On top of these wildcards, you can use the list wildcard [] to match one
character from a list of characters. For example, [xyz] matches any single x, y, or
z character.

The list wildcard also allows a range of characters e.g., [a-z] matches any single
lowercase character from a to z. The [a-zA-Z0-9] pattern matches any single
alphanumeric character, both lowercase, and uppercase.

Besides, you can use the character ^ at the beginning of the list to match any
character except for any character in the list. For example, the [^0-9] pattern
matches any single character except a numeric character.
Syntax
SELECT Columns FROM Table WHERE Column GLOB 'pattern';
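The wildcards above can be checked against a few sample strings. This sketch assumes a hypothetical files table; the helper function glob() is defined here only for the example and is not part of sqlite3.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE files (name TEXT);
    INSERT INTO files VALUES ('report1.txt'), ('Report2.txt'), ('notes.md'), ('r.txt');
""")

def glob(pattern):
    """Return the names that match the given GLOB pattern, sorted."""
    rows = con.execute(
        "SELECT name FROM files WHERE name GLOB ? ORDER BY name", (pattern,)
    ).fetchall()
    return [r[0] for r in rows]

star = glob("r*.txt")            # case sensitive: 'Report2.txt' is excluded
one_char = glob("?.txt")         # exactly one character before '.txt'
digit = glob("*[0-9].txt")       # a digit immediately before '.txt'
no_digit = glob("*[^0-9].txt")   # a non-digit immediately before '.txt'
```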
SQLite Transaction
SQLite & ACID
SQLite is a transactional database in which all changes and queries are atomic,
consistent, isolated, and durable (ACID).

SQLite guarantees that all transactions are ACID compliant even if a transaction
is interrupted by a program crash, an operating system crash, or a power failure
on the computer.
Atomic: a transaction should be atomic. It means that a change cannot be broken
down into smaller ones. When you commit a transaction, either the entire
transaction is applied or none of it is.
Consistent: a transaction must take the database from one valid state to
another. While a transaction is executing statements that modify data, the
database may be temporarily inconsistent. However, once the transaction is
committed or rolled back, the database must be consistent again.
Isolated: a pending transaction performed by a session must be isolated from
other sessions. When a session starts a transaction and executes an INSERT or
UPDATE statement to change the data, these changes are only visible to the
current session, not others. On the other hand, the changes committed by other
sessions after the transaction started should not be visible to the current session.
Durable: if a transaction is successfully committed, the changes must be
permanent in the database regardless of conditions such as a power failure or
program crash. Conversely, if the program crashes before the transaction
is committed, the change should not persist.

SQLite transaction statements


By default, SQLite operates in auto-commit mode. It means that for each
command, SQLite starts, processes, and commits the transaction automatically.

To start a transaction explicitly, you use the following steps:

First, open a transaction by issuing the BEGIN TRANSACTION command.


BEGIN TRANSACTION;

After executing the statement BEGIN TRANSACTION, the transaction is open
until it is explicitly committed or rolled back.

Second, issue SQL statements to select or update data in the database. Note that
the change is only visible to the current session (or client).
Third, commit the changes to the database by using the COMMIT or COMMIT
TRANSACTION statement.
COMMIT;

If you do not want to save the changes, you can roll back using the ROLLBACK
or ROLLBACK TRANSACTION statement:
ROLLBACK;
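The BEGIN, COMMIT and ROLLBACK steps can be sketched from Python as follows. The accounts table is hypothetical. The connection is opened with isolation_level=None, which is an assumption of this sketch: it switches off the sqlite3 module's own implicit transaction handling so that the statements reach SQLite exactly as written.

```python
import sqlite3

# isolation_level=None turns off the module's implicit transaction handling,
# so BEGIN / COMMIT / ROLLBACK below are sent to SQLite exactly as written.
con = sqlite3.connect(":memory:", isolation_level=None)
con.execute("CREATE TABLE accounts (name TEXT, balance INTEGER)")
con.execute("INSERT INTO accounts VALUES ('a', 100), ('b', 0)")

# A committed transfer: both updates are applied together.
con.execute("BEGIN TRANSACTION")
con.execute("UPDATE accounts SET balance = balance - 40 WHERE name = 'a'")
con.execute("UPDATE accounts SET balance = balance + 40 WHERE name = 'b'")
con.execute("COMMIT")

# A rolled-back change: the update does not survive.
con.execute("BEGIN TRANSACTION")
con.execute("UPDATE accounts SET balance = balance - 999 WHERE name = 'a'")
con.execute("ROLLBACK")

balances = dict(con.execute("SELECT name, balance FROM accounts"))
con.close()
```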

SQLite Trigger
An SQLite trigger is a named database object that is executed automatically when
an INSERT, UPDATE or DELETE statement is issued against the associated
table.

SQLite CREATE TRIGGER statement


To create a new trigger in SQLite, you use the CREATE TRIGGER statement as
follows:
CREATE TRIGGER [IF NOT EXISTS] trigger_name
[BEFORE|AFTER|INSTEAD OF] [INSERT|UPDATE|DELETE]
ON table_name
[WHEN condition]
BEGIN
statements;
END;
In this syntax:

✓ First, specify the name of the trigger after the CREATE TRIGGER keywords.
✓ Next, determine when the trigger is fired such as BEFORE, AFTER, or
INSTEAD OF. You can create BEFORE and AFTER triggers on a table.
However, you can only create an INSTEAD OF trigger on a view.
✓ Then, specify the event that causes the trigger to be invoked such as INSERT,
UPDATE, or DELETE.
✓ After that, indicate the table to which the trigger belongs.
✓ Finally, place the trigger logic in the BEGIN END block, which can be any
valid SQL statements.
If you combine the time when the trigger is fired and the event that causes the
trigger to be fired, you have a total of 9 possibilities:

✓ BEFORE INSERT
✓ AFTER INSERT
✓ BEFORE UPDATE
✓ AFTER UPDATE
✓ BEFORE DELETE
✓ AFTER DELETE
✓ INSTEAD OF INSERT
✓ INSTEAD OF DELETE
✓ INSTEAD OF UPDATE

Suppose you use an UPDATE statement to update 10 rows in a table. If the trigger
associated with the table is fired 10 times, once per row, it is called a FOR EACH
ROW trigger. If the trigger is fired only once for the whole statement, it is called
a FOR EACH STATEMENT trigger.

As of version 3.9.2, SQLite only supports FOR EACH ROW triggers. It has not
yet supported the FOR EACH STATEMENT triggers.

If you use a condition in the WHEN clause, the trigger is only invoked when the
condition is true. In case you omit the WHEN clause, the trigger is executed for
all rows.
Notice that if you drop a table, all associated triggers are also deleted.

You can access the data of the row being inserted, deleted, or updated using the
OLD and NEW references in the form: OLD.column_name and
NEW.column_name.

The OLD and NEW references are available depending on the event that causes
the trigger to be fired.

The following table illustrates the rules:

Action    Reference
INSERT    NEW is available
UPDATE    Both NEW and OLD are available
DELETE    OLD is available
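To make the OLD and NEW references concrete, here is a sketch of an AFTER UPDATE trigger with a WHEN clause. The products and price_log tables and the trigger name log_price_change are assumptions made for this example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, price REAL);
    CREATE TABLE price_log (product_id INTEGER, old_price REAL, new_price REAL);

    CREATE TRIGGER IF NOT EXISTS log_price_change
    AFTER UPDATE ON products
    WHEN OLD.price <> NEW.price
    BEGIN
        INSERT INTO price_log VALUES (OLD.id, OLD.price, NEW.price);
    END;

    INSERT INTO products VALUES (1, 10.0), (2, 20.0);
    UPDATE products SET price = 12.5 WHERE id = 1;   -- price changes: logged
    UPDATE products SET price = 20.0 WHERE id = 2;   -- same price: WHEN is false
""")
log = con.execute("SELECT * FROM price_log").fetchall()
con.close()
```

Only the first UPDATE produces a log row, because the WHEN condition is false for the second one.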
SQLite DROP TRIGGER statement

DROP TRIGGER [IF EXISTS] trigger_name;

In this syntax:
✓ First, specify the name of the trigger that you want to drop after the DROP
TRIGGER keywords.
✓ Second, use the IF EXISTS option to delete the trigger only if it exists.
Note that if you drop a table, SQLite will automatically drop all triggers
associated with the table.

SQLite Date & Time


SQLite does not have a built-in date or time storage class. Instead, its
built-in date and time functions work with values stored in other storage
classes such as TEXT, REAL, or INTEGER.

Using the TEXT storage class for storing SQLite date and time
If you use the TEXT storage class to store date and time value, you need to use
the ISO8601 string format as follows:
YYYY-MM-DD HH:MM:SS.SSS

To insert date and time values into the table, you use the DATETIME function.

For example, to get the current UTC date and time value, you pass the now literal
string to the function as follows:

SELECT datetime('now');

To get the local time, you pass an additional argument localtime.


SELECT datetime('now','localtime');

Using REAL storage class to store SQLite date and time values
You can use the REAL storage class to store the date and/ or time values as Julian
day numbers, which is the number of days since noon in Greenwich on November
24, 4714 B.C. based on the proleptic Gregorian calendar.

Use the julianday() function to convert the current date and time to a
Julian day number:
julianday('now')
Using INTEGER to store SQLite date and time values
Besides TEXT and REAL storage classes, you can use the INTEGER storage
class to store date and time values.

We typically use the INTEGER to store UNIX time which is the number of
seconds since 1970-01-01 00:00:00 UTC

strftime('%s','now')
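The three representations can be compared for one and the same moment. The sketch below uses a fixed timestamp instead of 'now' so that the results are reproducible; the chosen timestamp is arbitrary.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# A fixed timestamp is used instead of 'now' so the results are reproducible.
text_value, julian, unix_seconds = con.execute(
    "SELECT datetime('2000-01-01 12:00:00'), "
    "       julianday('2000-01-01 12:00:00'), "
    "       strftime('%s', '2000-01-01 12:00:00')"
).fetchone()
con.close()

# strftime() returns TEXT, so convert it before storing it as an INTEGER.
unix_int = int(unix_seconds)
print(text_value, julian, unix_int)
```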
Unit-2: Database backup and CSV handling:
Dump the entire database into a file using the SQLite dump command

To dump a database into a file, you use the .dump command. The .dump
command converts the entire structure and data of an SQLite database into a
single text file.
sqlite3 file-path
.dump
.exit

By default, the .dump command outputs the SQL statements on screen.


To issue the output to a file, you use the .output FILENAME command.
e.g.
.output file-path
.dump
.exit

Dump a specific table using the SQLite dump command

To dump a specific table, you specify the table name after the .dump command.
.output file-path
.dump tablename
.exit

Dump tables structure only using schema command


To dump the table structures in a database, you use the .schema command.
.output file-path
.schema
.exit

Dump data of one or more tables into a file


To dump the data of a table into a text file, you use these steps:
First, set the mode to insert using the .mode command as follows:
.mode insert
Second, set the output to a text file instead of the default standard output.
The following command sets the output file to the data.sql file.
.output data.sql
Third, issue a SELECT statement against each table whose data you want to dump.
From now on, every SELECT statement writes its result as INSERT statements
instead of plain text data.
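The dot-commands above belong to the sqlite3 command-line shell. From Python, a comparable SQL text dump can be produced with the Connection.iterdump() method of the standard sqlite3 module. The following is a minimal sketch with a made-up cities table; writing dump_sql to a file is left out.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE cities (name TEXT, code INTEGER)")
con.execute("INSERT INTO cities VALUES ('Surat', 1)")
con.commit()

# iterdump() yields the SQL statements needed to recreate the database.
dump_sql = "\n".join(con.iterdump())

# Restoring: run the dump against a fresh database.
restored = sqlite3.connect(":memory:")
restored.executescript(dump_sql)
restored_rows = restored.execute("SELECT * FROM cities").fetchall()

con.close()
restored.close()
```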
What is CSV?
CSV stands for comma separated values. It is not a file type. CSV, in fact, is a
format. It refers to the way the data is structured (or formatted), not the type of
file or program used to open it. A CSV file is formatted so that every piece of
information in the file is separated with a comma. When a program reads a CSV
file, it knows that every time it sees a comma, the next piece of information should
be treated separately from the preceding information.

why do we use it?


Moving Data
The single biggest use for CSV is to move data between two (or more!) places.
This means importing and exporting.
CSV is one of the simplest and most widely accepted data formats. No fancy
formulas, no complex formatting, no proprietary programming language - just
simple plain text and commas.

The simplicity of CSV means that it is readable by almost any program you
use - such as Excel, Numbers, your CRM or other databases. So, regardless of
where you export or import data from, CSV will usually work.

An added benefit of CSV's simplicity is that it is generally fast to
process compared with more complex file formats. Reading CSV files does not
require complex programming, so whether you are performing an export or import,
the process tends to be efficient with CSV formatted files.

Another benefit, (not to be discounted!) is the fact that CSV is data stored exactly
"as-is". That means that what data you have saved will never be wrongly
formatted or translated, regardless of where you open it or import it into.

Backing up Data
Data backups are an essential business process, and so it is important to ensure
your data is being saved in a format that is both readable and accessible to you.
CSV, being the universal language, will ensure that the data you backup is able
to be opened and read, and even imported back into your database.

Does CSV have rows limit?


CSV is just a data format; it is not the file or program itself. So, you can open
a CSV in a text editor, Excel, Numbers, Google Sheets, and hundreds of other
programs. Each of those programs has its own limitations and rules about the
files it can open, so any row limit comes from the program, not from CSV itself.
To summarize, you can think of CSV as one of the most portable file formats. CSV
files take little time to process, take up little space when
downloaded, and can be taken virtually anywhere. This is why many apps
and software packages use CSV as their format of choice when working
with data coming in or out.

Import a CSV File into an SQLite Table


To import the CSV file into the SQLite table:
First, set the mode to CSV to instruct the command-line shell program to interpret
the input file as a CSV file. To do this, you use the .mode command as follows:
sqlite> .mode csv
Second, use the command .import FILE TABLE to import the data from the csv
file into the table.
sqlite>.import File-Path Table

Export SQLite Database to a CSV File


To export data from the SQLite database to a CSV file, you use these steps:
✓ Turn on the header of the result set using the .header on command.
✓ Set the output mode to CSV to instruct the sqlite3 tool to issue the result in
the CSV mode.
✓ Send the output to a CSV file.
✓ Issue the query to select data from the table to which you want to export.

e.g.
sqlite> .header on
sqlite> .mode csv
sqlite> .output Employee.csv
sqlite> SELECT * FROM table_name;
sqlite> .quit
✓ To include the table column names in the CSV file, we used the .header on
command.
✓ To return the data in CSV format, we used .mode csv.
✓ To send the data to a CSV file, we used the .output command, followed by a
SELECT statement to export data from the required table.
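The same export can be done from Python with the csv module. This is a sketch, not the shell's own mechanism: the employee table is hypothetical, and io.StringIO stands in for a real output file.

```python
import csv
import io
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employee (id INTEGER, name TEXT)")
con.executemany("INSERT INTO employee VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")])

cur = con.execute("SELECT * FROM employee ORDER BY id")
# io.StringIO stands in for a real file such as open("Employee.csv", "w", newline="").
out = io.StringIO(newline="")
writer = csv.writer(out)
writer.writerow([col[0] for col in cur.description])  # header row, like .header on
writer.writerows(cur)                                   # one CSV line per table row
csv_text = out.getvalue()
con.close()
```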
Import a CSV File into an SQLite Table
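You can also import data from a CSV file into a table that does not yet exist in
the SQLite database. In that case, the .import command creates the table and uses
the first row of the CSV file as the column names.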
To import the csv file into the table:
First, set the mode to CSV to instruct the command-line shell program to
interpret the input file as a CSV file.
To do this, you use the .mode command as follows:
sqlite> .mode csv

Second, use the command .import FILE TABLE to import the data from the
city.csv file into the cities table.

sqlite>.import c:/sqlite/city.csv cities
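A CSV import can also be written in Python by combining csv.reader with executemany(). In this sketch the CSV content and the cities table layout are invented, and io.StringIO stands in for the real file.

```python
import csv
import io
import sqlite3

# io.StringIO stands in for a real file such as open("city.csv", newline="").
source = io.StringIO("name,country\nSurat,India\nPune,India\n")
reader = csv.reader(source)
header = next(reader)  # first row holds the column names

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE cities (name TEXT, country TEXT)")
# Each remaining CSV row supplies the values for one INSERT.
con.executemany("INSERT INTO cities VALUES (?, ?)", reader)
con.commit()
imported = con.execute("SELECT * FROM cities ORDER BY name").fetchall()
con.close()
```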

Import a CSV file into a table using SQLite Studio


Most SQLite GUI tools provide the import function that allows you to import
data from a file in CSV format, tab-delimited format, etc., into a table.

We will use the SQLite Studio to show you how to import a CSV file into a
table with the assumption that the target table already exists in the database.
Unit-3: Python interaction with SQLite

Python Module
A module is a file containing Python definitions and statements. A module can
define functions, classes and variables. A module can also include runnable code.
Grouping related code into a module makes the code easier to understand and use.
Simply, a module is a file consisting of Python code.
e.g.
The Python code for a module named “myModule” normally resides in a file
named “myModule.py”.
e.g.
def print_func(par):
    print("Hello : ", par)
    return
Save this code in the myModule.py file.

import statement
You can use any Python source file as a module by executing an import statement
in some other Python source file.
Syntax:
import module1[, module2[, ... moduleN]]
When the interpreter encounters an import statement, it imports the module if the
module is present in the search path. A search path is a list of directories that the
interpreter searches before importing a module.
e.g.
import myModule
myModule.print_func("SYBCA")

The from….import statement


Python's from….import statement lets you import specific attributes from a
module into the current namespace.
Syntax:
from module import name1[,name2,name3,….N]
e.g.
from myModule import print_func
This statement does not import the entire module “myModule” into the current
namespace; it just introduces the item “print_func” from the module “myModule” into
the global symbol table of the importing module.
The from….import * statement
It is also possible to import all names from a module into the current namespace
by using the following import statement

from module import *


This provides an easy way to import all the items from a module into the current
namespace

When you import a module, the Python interpreter searches for the module in
the following order:
1. The current directory.
2. If the module isn't found, Python then searches each directory in the shell
variable PYTHONPATH.
3. If all else fails, Python checks the default path.

The module search path is stored in the system module sys as the sys.path
variable. The sys.path variable contains the current directory, PYTHONPATH,
and the installation-dependent default.

The PYTHONPATH Variable


The PYTHONPATH is an environment variable, consisting of a list of
directories. The syntax of PYTHONPATH is the same as that of the shell variable
PATH.

Namespaces and Scoping


Variables are names (identifiers) that map to objects. A namespace is a dictionary
of variable names (keys) and their corresponding objects (values).
A Python statement can access variables in a local namespace and in the global
namespace. If a local and a global variable have the same name, the local variable
shadows the global variable.

Each function has its own local namespace. Class methods follow the same
scoping rule as ordinary functions.
Python makes educated guesses on whether variables are local or global. It
assumes that any variable assigned a value in a function is local.
Therefore, in order to assign a value to a global variable within a function, you
must first use the global statement.
The statement global VarName tells Python that VarName is a global variable.
Python stops searching the local namespace for the variable.
e.g.
Money = 2000
def AddMoney():
    # Uncomment the following line to fix the code:
    # global Money
    Money = Money + 1   # fails with UnboundLocalError while the line above is commented out
    print(Money)
AddMoney()
print(Money)
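The effect of the global statement can be seen by running both variants. This sketch uses the hypothetical names counter, broken_increment and working_increment.

```python
counter = 0

def broken_increment():
    # counter is assigned here, so Python treats it as local;
    # reading it before the assignment raises UnboundLocalError.
    counter = counter + 1

def working_increment():
    global counter        # refer to the module-level variable instead
    counter = counter + 1

try:
    broken_increment()
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__

working_increment()
print(error_name, counter)
```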

The dir( ) Function


The dir() built-in function returns a sorted list of strings containing the names
defined by a module.
The list contains the names of all the modules, variables and functions that are
defined in a module.
e.g.
import math
content = dir(math)
print (content)

Python Package
Python has packages for directories and modules for files. As our application
program grows larger in size with a lot of modules, we place similar modules in
one package and different modules in different packages. This makes a project
(program) easy to manage and conceptually clear.

A Python module may contain several classes, functions, variables, etc. whereas
a Python package can contain several modules. In simpler terms, a package is a
folder that contains various modules as files.

Creating Package
✓ Create a folder named with packagename.
✓ Inside this folder create an empty Python file i.e. __init__.py
✓ Then create two modules in that folder.

__init__.py
__init__.py helps the Python interpreter to recognise the folder as a package. It can
also specify the resources to be imported from the modules. If __init__.py is
empty, the package itself exposes no names, and each module has to be imported
explicitly.
We can also specify the functions from each module to be made available.
Syntax
from .modulename import function,class,variable

Import Modules from a Package


We can import these modules using the from…import statement and the dot(.)
operator.

Syntax
import package_name.module_name

Python Exception Handling


Python has many built-in exceptions that are raised when your program
encounters an error (something in the program goes wrong).
When these exceptions occur, the Python interpreter stops the current process
and passes it to the calling process until it is handled. If not handled, the
program will crash.
Handling Exception in Python
In Python, exceptions can be handled using a try statement.
The critical operation which can raise an exception is placed inside the try
clause. The code that handles the exceptions is written in the except clause. We
can thus choose what operations to perform once we have caught the exception.
Syntax
try:
    # statement(s)
except Exception as e:
    # statement(s)

Catching Specific Exceptions in Python


A try statement can have more than one except clause, to specify handlers for
different exceptions. Please note that at most one handler will be executed.

try:
    # do something
    pass
except ValueError:
    # handle ValueError exception
    pass
except (TypeError, ZeroDivisionError):
    # handle multiple exceptions
    # TypeError and ZeroDivisionError
    pass
except:
    # handle all other exceptions
    pass

Python try with else clause


In some situations, you might want to run a certain block of code if the code block
inside try ran without any errors. For these cases, you can use the optional else
keyword with the try statement.
e.g
def div(a, b):
    try:
        c = a / b
    except ZeroDivisionError:
        print("Cannot divide by zero")
    else:
        print(c)
finally Keyword in Python
Python provides the finally keyword, whose block is always executed after the try
and except blocks. The finally block runs whether the try block terminates
normally or because of an exception.

Syntax:
try:
    # Some Code....
except:
    # optional block
    # Handling of exception (if required)
else:
    # execute if no exception
finally:
    # Some code .....(always executed)
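The order in which the blocks run can be recorded in a list. The function name safe_divide and the step labels are chosen only for this sketch.

```python
def safe_divide(a, b):
    """Divide a by b and record which blocks were executed."""
    steps = []
    try:
        result = a / b
    except ZeroDivisionError:
        steps.append("except")   # runs only when the division fails
        result = None
    else:
        steps.append("else")     # runs only when no exception occurred
    finally:
        steps.append("finally")  # runs in both cases
    return result, steps

ok = safe_divide(6, 3)
failed = safe_divide(1, 0)
print(ok, failed)
```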
Raising Exception
In Python programming, exceptions are raised when errors occur at runtime. We
can also manually raise exceptions using the raise keyword. The operand of raise
must be either an exception instance or an exception class (a class that derives
from Exception).

try:
    raise NameError("Hi there")  # raise the error
except NameError:
    print("An exception")
    raise  # re-raise the same exception so that it propagates to the caller
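Raising an exception class of your own follows the same pattern. In this sketch, InsufficientFunds and withdraw are hypothetical names used to show a class derived from Exception being raised and caught.

```python
class InsufficientFunds(Exception):
    """Hypothetical application-specific error."""

def withdraw(balance, amount):
    if amount > balance:
        # Raise an instance of the custom exception class with a message.
        raise InsufficientFunds(f"cannot withdraw {amount} from {balance}")
    return balance - amount

try:
    withdraw(100, 250)
    message = None
except InsufficientFunds as exc:
    message = str(exc)

remaining = withdraw(100, 30)
print(message, remaining)
```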
sqlite3 Module in Python
SQLite is a C library that provides a lightweight disk-based database that doesn’t
require a separate server process and allows accessing the database using a
nonstandard variant of the SQL query language.
The sqlite3 module was written by Gerhard Häring.

Connecting to the Database

Connection to the SQLite Database can be established using the connect()
function, passing the name of the database file to be accessed as a parameter. If
that database does not exist, then it’ll be created.

import sqlite3
con = sqlite3.connect('databasefile')

Once a Connection has been established, create a Cursor object and call its
execute() method to perform SQL commands:

Syntax:
execute(sql, parameters=(), /)
Execute an SQL statement. Values may be bound to the statement using
placeholders.

execute() will only execute a single SQL statement. If you try to execute more
than one statement with it, it raises an exception (sqlite3.Warning in older
Python versions, sqlite3.ProgrammingError in newer ones). Use executescript() if
you want to execute multiple SQL statements with one call.

To insert a variable into a query string, use a placeholder in the string, and
substitute the actual values into the query by providing them as a tuple of values
to the second argument of the cursor’s execute() method. An SQL statement may
use one of two kinds of placeholders: question marks (qmark style) or named
placeholders (named style).
For the qmark style, parameters must be a sequence.
For the named style, it can be either a sequence or dict instance.

# This is the qmark style:
cur.execute("sql statement (?, ?)", ("value1", value2))
# And this is the named style:
cur.execute("sql statement=:parameter", {"parameter": value})

executemany(sql, seq_of_parameters, /)
Execute a parameterized SQL command against all parameter sequences or
mappings found in the sequence seq_of_parameters. It is also possible to use an
iterator yielding parameters instead of a sequence.

# The qmark style used with executemany():
lang_list = [("value1", value2), ("value1", value2), ("value1", value2)]
cur.executemany("sql statement (?, ?)", lang_list)

# Create a Cursor object
cur = con.cursor()

# Use the execute() method to run a query on the database
cur.execute("SQLite query")

# Save (commit) the changes
con.commit()

# We can also close the connection if we are done with it.
con.close()

executescript(sql_script, /)
Execute multiple SQL statements at once. If there is a pending transaction, an
implicit COMMIT statement is executed first. No other implicit transaction
control is performed; any transaction control must be added to sql_script.
sql_script must be a string.
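The methods described above can be combined into one short session. This is a sketch with an in-memory database; the lang and notes tables are made up for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# executescript(): several statements in one call.
cur.executescript("""
    CREATE TABLE lang (name TEXT, year INTEGER);
    CREATE TABLE notes (body TEXT);
""")

# execute() with qmark placeholders: values are passed as a tuple.
cur.execute("INSERT INTO lang VALUES (?, ?)", ("C", 1972))

# execute() with named placeholders: values are passed as a dict.
cur.execute("INSERT INTO lang VALUES (:name, :year)", {"name": "Python", "year": 1991})

# executemany(): the same statement once per parameter tuple.
cur.executemany("INSERT INTO lang VALUES (?, ?)", [("Java", 1995), ("Go", 2009)])
con.commit()

after_1990 = cur.execute(
    "SELECT name FROM lang WHERE year > :y ORDER BY year", {"y": 1990}
).fetchall()
con.close()
```

Passing values through placeholders, rather than building the SQL string by hand, lets the module handle quoting of the values.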

fetchone()
The fetchone() returns the next row of a query result set, returning a single tuple,
or None when no more data is available.
e.g.
import sqlite3
con = sqlite3.connect('ydb.db')
with con:
    cur = con.cursor()
    cur.execute("SELECT * FROM cars")
    while True:
        row = cur.fetchone()
        if row is None:   # no more rows
            break
        print(f"{row[0]} {row[1]} {row[2]}")

Python SQLite dictionary cursor


By default, each row is returned as a tuple. When a row factory such as
sqlite3.Row is set on the connection, each row can also be accessed by column
name, much like a Python dictionary.
e.g.
import sqlite3
con = sqlite3.connect('database.db')
con.row_factory = sqlite3.Row   # rows can now be indexed by column name
cur = con.cursor()
cur.execute("SELECT * FROM tasks")
rows = cur.fetchall()
for row in rows:
    print(dict(row))
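Access by column name can be sketched with sqlite3.Row, the row factory that ships with the sqlite3 module. The tasks table layout (id, title) is an assumption made for this example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.row_factory = sqlite3.Row   # rows now support access by column name
con.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, title TEXT)")
con.execute("INSERT INTO tasks (title) VALUES ('write report')")

row = con.execute("SELECT id, title FROM tasks").fetchone()
by_name = row["title"]    # access by column name
by_index = row[1]         # access by position still works
columns = row.keys()      # list of column names
as_dict = dict(row)       # convert to a plain dictionary
con.close()
```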

File Handling in Python


Files
Files are named locations on disk to store related information. They are used to
permanently store data in a non-volatile memory (e.g. hard disk).
Since Random Access Memory (RAM) is volatile (which loses its data when the
computer is turned off), we use files for future use of the data by permanently
storing them.

Python treats a file as either text or binary, and the distinction is important. A
text file consists of lines, each of which is a sequence of characters. Each line
is terminated with a special character, called the EOL or End of Line character,
such as the newline character (\n). It ends the current line and tells the
interpreter that a new one has begun.

When we want to read from or write to a file, we need to open it first. When we
are done, it needs to be closed so that the resources that are tied with the file are
freed.
Hence, in Python, a file operation takes place in the following order:
1. Open a file
2. Read or write (perform operation)
3. Close the file
Opening Files in Python
Python has a built-in open() function to open a file. This function returns a file
object, also called a handle, as it is used to read or modify the file accordingly.

syntax : open(filename, mode)

The following modes are available for opening a file:

Mode  Description
r     Opens a file for reading. (default)
w     Opens a file for writing. Creates a new file if it does not exist or
      truncates the file if it exists.
x     Opens a file for exclusive creation. If the file already exists, the
      operation fails.
a     Opens a file for appending at the end of the file without truncating it.
      Creates a new file if it does not exist.
+     Opens a file for updating (reading and writing)

In addition, you can specify if the file should be handled as binary or text mode
"t" - Text - Default value. Text mode
"b" - Binary - Binary mode (e.g. images)
e.g.
f = open("test.txt") # equivalent to 'r' or 'rt'
f = open("test.txt",'w') # write in text mode
f = open("img.bmp",'r+b') # read and write in binary mode

Closing Files in Python


When we are done with performing operations on the file, we need to properly
close the file.
Closing a file will free up the resources that were tied with the file. It is done
using the close() method available in Python.
e.g.
f = open("test.txt", encoding = 'utf-8')
# perform file operations
f.close()
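If an exception occurs between open() and close(), the close() call is skipped. A with statement closes the file automatically in either case. The sketch below writes to a temporary directory so that it does not touch existing files; the file name test.txt is arbitrary.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test.txt")

# The file is closed automatically when the with block ends.
with open(path, "w", encoding="utf-8") as f:
    f.write("hello\n")
closed_after_with = f.closed

with open(path, encoding="utf-8") as f:
    content = f.read()
print(closed_after_with, repr(content))
```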
Working of read() mode
There is more than one way to read a file in Python.
read() method
If you need to extract a string that contains all characters in the file then we can
use file.read().
e.g.
file = open("file.txt", "r")
print(file.read())
Another way to read a file is to request a certain number of characters. In the
following code, the interpreter reads the first five characters of the stored data
and returns them as a string:
e.g.
file = open("file.txt", "r")
print(file.read(5))

tell()
Returns an integer that represents the current position within the file.
seek()
Changes the file position to offset bytes, relative to the reference point given
by from_what (start, current, end).
Syntax: f.seek(offset, from_what)
Parameters:
offset: number of positions to move
from_what: defines the point of reference.
Returns: the new absolute position.
The reference point is selected by the from_what argument.
It accepts three values:
0: sets the reference point at the beginning of the file
1: sets the reference point at the current file position
2: sets the reference point at the end of the file
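The three reference points can be tried on a small file. The sketch opens the file in binary mode because text-mode files only allow seeking relative to the start (apart from seek(0, 2)); the file name and contents are arbitrary.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.bin")
with open(path, "wb") as f:
    f.write(b"0123456789")

with open(path, "rb") as f:        # binary mode allows all three reference points
    first = f.read(3)              # reads bytes 0..2
    pos_after_read = f.tell()      # position is now 3
    f.seek(2, 1)                   # 2 bytes forward from the current position
    middle = f.read(2)             # reads bytes 5..6
    f.seek(-3, 2)                  # 3 bytes before the end of the file
    tail = f.read()                # reads the last 3 bytes
    f.seek(0, 0)                   # back to the beginning
    pos_at_start = f.tell()
```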

By looping through the lines of the file, you can read the whole file, line by line:
f = open("demofile.txt", "r")
for x in f:
print(x)

readline() and readlines()
The readline() method reads a single line from the file. The readlines() method
returns a list of the remaining lines of the file. All these
reading methods return empty values when the end of file (EOF) is reached.
write()
Writes the given string to the file and returns the number of characters written.
Syntax: write(string)

writelines()
Writes a list of lines to the file.
Syntax: writelines(lines)
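The writing and reading methods can be exercised together. This sketch uses a temporary file; note that writelines() does not add line endings, so each string carries its own newline.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demofile.txt")

with open(path, "w", encoding="utf-8") as f:
    written = f.write("first\n")               # returns the character count
    f.writelines(["second\n", "third\n"])     # no newlines are added for you

with open(path, encoding="utf-8") as f:
    line = f.readline()      # one line
    rest = f.readlines()     # list of the remaining lines
    at_eof = f.readline()    # empty string at end of file
```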

Delete a File
To delete a file, you must import the os module, and run
its os.remove() function:
e.g.
Remove the file "demofile.txt":
import os
os.remove("demofile.txt")

Code to Check if File exist


import os
if os.path.exists("demofile.txt"):
    os.remove("demofile.txt")
else:
    print("The file does not exist")

CSV File Handling in Python


A CSV (Comma Separated Values) format is one of the most simple and common
ways to store tabular data. To represent a CSV file, it must be saved with the .csv
file extension.
There is an inbuilt module called csv. The csv module implements classes to read
and write tabular data in CSV format. It allows programmers to say, “write this
data in the format preferred by Excel,” or “read data from this file which was
generated by Excel,”

How to Read Data From CSV File?


Step 1 - import the csv library
Step 2 - Open the csv file using open() function
Step 3 - Use the csv.reader object to read the CSV file
e.g.
import csv
fobj = open('File path', newline='')
csvreader = csv.reader(fobj)
The csv.reader() function is used to read the file; it returns an iterable reader
object. By default, the function expects CSV files that use the comma as the
delimiter. Suppose our CSV file was using tab as a delimiter. To read such files,
we can pass optional parameters to the csv.reader() function.
e.g.
csv.reader(file, delimiter='\t')

Extract the field names


Create an empty list called header.
Use the next() function to obtain the header.
Calling next() on the reader returns the current row and moves to the next row.
The first time you call next() it returns the header, the next time it
returns the first record, and so on.
e.g.
header = []
header = next(csvreader)
header
Extract the rows/records
Create an empty list called rows and iterate through the csvreader object and
append each row to the rows list.
rows = []
for row in csvreader:
    rows.append(row)
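The reading steps can be run end to end without a file on disk. In this sketch io.StringIO stands in for the opened file, and the tab-delimited sample data is invented.

```python
import csv
import io

# io.StringIO stands in for an opened file; the data is tab-delimited.
fobj = io.StringIO("name\tcity\nAsha\tSurat\nRavi\tPune\n")
csvreader = csv.reader(fobj, delimiter="\t")

header = next(csvreader)                # first row: field names
rows = [row for row in csvreader]       # remaining rows: records
print(header, rows)
```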

How to Write Data to a CSV File?


Let’s assume we are recording the data of 3 contacts:
header = ['con_name', 'con_num', 'con_email']
data = [
    ['Pritesh', '9925566582', '[email protected]'],
    ['Anita', '1243568798', '[email protected]'],
    ['Hiya', '1212121212', '[email protected]']
]
Step 1 - import the csv library
Step 2 - Open the csv file using open() function
Step 3 - Create a csvwriter object using csv.writer()
Step 4 Write the Header
Step 5 Write data
e.g.
import csv
fobj = open(‘File path’,’w’,newline=’’)
csvwriter=csv.writer(fobj) # create a csvwriter object
csvwriter.writerow(header) # to write a single header row to file
csvwriter.writerows(data) # to write multiple rows in file
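
A possible end-to-end version of the writing steps is sketched below. The column names, the values, and the file name are illustrative assumptions. The file is read back at the end only to show what was written, including a field that contains a comma.

```python
import csv
import os
import tempfile

header = ["con_name", "con_city"]          # hypothetical columns
data = [["Pritesh", "Vapi"], ["Anita", "Surat, Gujarat"]]

out_path = os.path.join(tempfile.mkdtemp(), "contacts.csv")

# newline="" avoids extra blank rows on platforms such as Windows.
with open(out_path, "w", newline="") as fobj:
    csvwriter = csv.writer(fobj)
    csvwriter.writerow(header)
    csvwriter.writerows(data)

# Read the file back to confirm what was written.
with open(out_path, newline="") as fobj:
    written = list(csv.reader(fobj))

print(written)
```

The writer quotes the field "Surat, Gujarat" automatically because it contains the delimiter, so the reader returns it as a single value.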

DictReader and DictWriter Classes


DictReader and DictWriter are classes in the csv module for reading and
writing CSV files. They work like the reader and writer functions, but they
represent each row as a dictionary instead of a list.

DictReader
It creates an object which maps the information in each row to a dictionary whose
keys are given by the fieldnames parameter. This parameter is optional; when it is
not specified, the values in the first row of the file become the keys of the
dictionary.
e.g.
import csv
with open('names.csv', newline='') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        print(row['first_name'], row['last_name'])
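
The effect of the fieldnames parameter can be seen with data that has no header row. The following is a small sketch; the in-memory sample text and the variable names are assumptions for illustration.

```python
import csv
import io

# Hypothetical file content with no header row.
raw = io.StringIO("Baked,Beans\nLovely,Spam\n")

# Because fieldnames is given, the first line is treated as data.
reader = csv.DictReader(raw, fieldnames=["first_name", "last_name"])
people = [dict(row) for row in reader]
print(people)
```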

DictWriter
This class is similar to the DictReader class and does the opposite, which is writing
data to a CSV file. The class is defined as
csv.DictWriter(csvfile, fieldnames, restval='', extrasaction='raise', dialect='excel',
*args, **kwds)

The fieldnames parameter defines the sequence of keys that identify the order in
which values in the dictionary are written to the CSV file.
Unlike in DictReader, this parameter is not optional and must be supplied when
the writer is created.
e.g.
import csv
csvfile = open('names.csv', 'w', newline='')
fieldnames = ['first_name', 'last_name']
writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
writer.writeheader()
writer.writerow({'first_name': 'Baked', 'last_name': 'Beans'})
writer.writerow({'first_name': 'Lovely', 'last_name': 'Spam'})
writer.writerow({'first_name': 'Wonderful', 'last_name': 'Spam'})
csvfile.close()
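
The restval and extrasaction parameters from the class definition can be tried out as in the sketch below. The value "N/A" and the extra key age are made up for the example, and io.StringIO is used in place of a real file.

```python
import csv
import io

buffer = io.StringIO()   # in-memory stand-in for a file opened with newline=''
fieldnames = ["first_name", "last_name"]

# restval fills in missing keys; extrasaction="ignore" drops unknown keys
# instead of raising ValueError.
writer = csv.DictWriter(buffer, fieldnames=fieldnames,
                        restval="N/A", extrasaction="ignore")
writer.writeheader()
writer.writerow({"first_name": "Baked"})                       # no last_name
writer.writerow({"first_name": "Lovely", "last_name": "Spam",
                 "age": 30})                                   # extra key

csv_text = buffer.getvalue()
print(csv_text)
```

With the default extrasaction='raise', the second writerow() call would fail because age is not listed in fieldnames.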
The Pandas DataFrame
The Pandas DataFrame is a structure that contains two-dimensional data and its
corresponding labels. DataFrames are widely used in data science, machine
learning, scientific computing, and many other data-intensive fields.
DataFrames are similar to SQL tables or the spreadsheets that you work with in
Excel or Calc.

Pandas DataFrames are data structures that contain:


✓ Data organized in two dimensions, rows and columns
✓ Labels that correspond to the rows and columns

In practice, a Pandas DataFrame is usually created by loading a dataset from
existing storage, such as an SQL database, a CSV file, or an Excel file.
A DataFrame can also be created from lists, a dictionary, a list of
dictionaries, and so on.
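
As a rough illustration of two of these sources, the sketch below loads CSV text with pd.read_csv() and builds the same table from a list of dictionaries. The sample data is invented, and io.StringIO is used only so that no file is needed.

```python
import io
import pandas as pd

# Hypothetical CSV content; a file path would work in place of the StringIO object.
csv_text = "Name,Age\ntom,25\nkrish,30\n"
df_csv = pd.read_csv(io.StringIO(csv_text))

# The same table built from a list of dictionaries (one dictionary per row).
records = [{"Name": "tom", "Age": 25}, {"Name": "krish", "Age": 30}]
df_records = pd.DataFrame(records)

print(df_csv)
print(df_records)
```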

Creating a dataframe using List:


DataFrame can be created using a single list or a list of lists.
# import pandas as pd
import pandas as pd

# list of strings
lst = ['kbs', 'College', 'Vapi']
# Calling DataFrame constructor on list
df = pd.DataFrame(lst)
print(df)

Dataframe using list with index and column names

import pandas as pd

# list of strings
lst = ['Valsad', 'Surat', 'Vapi']

# Calling DataFrame constructor on list
# with indices and columns specified
df = pd.DataFrame(lst, index=['a', 'b', 'c'], columns=['City'])
print(df)

Using zip() for zipping two lists


import pandas as pd

# list of strings
lst = ['Pritesh', 'Nimesh', 'Mitesh']

# list of int
lst2 = [101, 102, 103]

# Calling DataFrame constructor after zipping
# both lists, with columns specified
df = pd.DataFrame(list(zip(lst2, lst)), columns=['Eno', 'Name'])
print(df)

Creating DataFrame using multi-dimensional list


# import pandas as pd
import pandas as pd
# List
lst = [['tom', 25], ['krish', 30],['nick', 26], ['juli', 22]]
df = pd.DataFrame(lst, columns =['Name', 'Age'])
print(df)

Using lists in dictionary to create dataframe

import pandas as pd

# lists of name, degree, score
nme = ["aparna", "pankaj", "sudhir", "Gayu"]
deg = ["MBA", "BCA", "M.Tech", "MBA"]
scr = [90, 40, 80, 98]

# dictionary of lists (named data so that the built-in dict is not shadowed)
data = {'name': nme, 'degree': deg, 'score': scr}
df = pd.DataFrame(data)
print(df)

head() and tail() function


Pandas DataFrames can sometimes be very large, making it impractical to look
at all the rows at once. You can use .head() to show the first few rows and .tail()
to show the last few rows. With no argument, both return five rows.
e.g.
df.head(n=2)   # returns the first two rows
df.tail(n=2)   # returns the last two rows
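
A short sketch of both functions on a ten-row DataFrame follows. The column names and values are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical ten-row DataFrame.
df = pd.DataFrame({"Eno": range(101, 111), "Score": range(0, 100, 10)})

first_two = df.head(n=2)   # rows with index 0 and 1
last_two = df.tail(n=2)    # rows with index 8 and 9
default_head = df.head()   # n defaults to 5

print(first_two)
print(last_two)
```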

Access Column from dataframe


You can access a column in a Pandas DataFrame the same way you would get a
value from a dictionary:
e.g.
df['column-name']
If the name of the column is a string that is a valid Python identifier, then you can
use dot notation to access it. That is, you can access the column the same way
you would get the attribute of a class instance:
e.g.
df.column_name

Each column of a Pandas DataFrame is an instance of pandas.Series, a structure
that holds one-dimensional data and their labels. You can get a single item of a
Series object the same way you would with a dictionary, by using its label as a
key:
e.g.
col = df['column-name']
col['row-label']
You can also access a whole row with the accessor .loc[]:
e.g.
df.loc['row-label']
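
The access styles described in this section can be compared side by side in the sketch below. The City and Pin columns, their values, and the row labels are hypothetical sample data.

```python
import pandas as pd

# Hypothetical data; the row labels 'a', 'b', 'c' are set through index.
df = pd.DataFrame({"City": ["Valsad", "Surat", "Vapi"],
                   "Pin": [396001, 395003, 396191]},
                  index=["a", "b", "c"])

city_col = df["City"]        # dictionary-style access, returns a Series
same_col = df.City           # dot notation, works because City is a valid identifier
one_item = city_col["b"]     # single value looked up by row label
row_b = df.loc["b"]          # whole row as a Series indexed by column name

print(one_item)
print(row_b)
```

Dictionary-style access is the safer default, because dot notation fails for column names that contain spaces or hyphens or that clash with an existing DataFrame attribute.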

Dialects and Formatting


In the csv module, a dialect is a helper class used to define the parameters for
a specific reader or writer instance. A dialect, or individual formatting
parameters that override it, can be passed when creating a reader or writer. If
none is given, the 'excel' dialect is used.

There are several attributes which are supported by a dialect:


delimiter: A one-character string used to separate fields. It defaults to ','.
doublequote: Controls how instances of quotechar appearing inside a field
should be quoted. When True (the default), the character is doubled. When
False, the escapechar is used as a prefix to the quotechar.
escapechar: A one-character string used by the writer to escape the delimiter if
quoting is set to QUOTE_NONE, and the quotechar if doublequote is False.
lineterminator: A string used to terminate lines produced by the writer. It
defaults to '\r\n'.
quotechar: A one-character string used to quote fields containing special
characters. It defaults to '"'.
skipinitialspace: If set to True, any white space immediately following the
delimiter is ignored. It defaults to False.
strict: If set to True, csv.Error is raised on bad CSV input. It defaults to False.
quoting: Controls when quotes should be generated by the writer and recognized
by the reader. It takes one of the QUOTE_* constants and defaults to
QUOTE_MINIMAL.
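
A few of these attributes can be tried out as in the sketch below. The dialect name pipes and the sample values are made up for illustration; csv.register_dialect() is the csv module function that associates a name with a set of formatting parameters.

```python
import csv
import io

# 'pipes' is a made-up dialect name used only for this illustration.
csv.register_dialect("pipes", delimiter="|", quoting=csv.QUOTE_ALL,
                     lineterminator="\n")

buffer = io.StringIO()
writer = csv.writer(buffer, dialect="pipes")
writer.writerow(["Pritesh", "Vapi"])
pipe_text = buffer.getvalue()

# skipinitialspace=True drops the blank after each comma while reading.
reader = csv.reader(io.StringIO("a, b, c\n"), skipinitialspace=True)
spaced_row = next(reader)

print(pipe_text)
print(spaced_row)
```

Registering a dialect is useful when the same set of parameters is needed by several readers and writers; for a single use, passing the parameters directly as keyword arguments works just as well.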
