Visualizing Data in Python
Objectives
In this session, you will learn to:
Implement Programming with multiple Tables.
Use different types of Keys.
Implement Joins to Retrieve Data.
Visualize Data.
Build a Google Map from Geocoding.
Visualize Networks and Interconnections.
Visualize Mail Data.
Implement Programming with Multiple Tables
The basic requirements to be considered when programming with
multiple tables are:
Use a unique id when retrieving, updating and deleting data from tables.
Follow all constraint rules when inserting and updating data.
Constraints in Database Tables
cur.execute('''CREATE TABLE IF NOT EXISTS People
(id INTEGER PRIMARY KEY, name TEXT UNIQUE, retrieved INTEGER)''')
cur.execute('''CREATE TABLE IF NOT EXISTS Follows
(from_id INTEGER, to_id INTEGER, UNIQUE(from_id, to_id))''')
Observations
The “name” column in the “People” table must be unique.
The combination of the two numbers in each row of the “Follows” table
must be unique.
cur.execute('''INSERT OR IGNORE INTO People (name, retrieved)
VALUES ( ?, 0)''', ( friend, ) )
The “OR IGNORE” clause is added to the INSERT statement to indicate that
the statement should ignore a violation of the rule “name must be unique”.
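The effect of the UNIQUE constraint and the OR IGNORE clause can be tried out with a minimal sketch like the one below. It assumes an in-memory SQLite database and a made-up name ('alice'); neither comes from the original program.

```python
import sqlite3

# Assumption: an in-memory database, so the sketch leaves no file behind.
conn = sqlite3.connect(':memory:')
cur = conn.cursor()

cur.execute('''CREATE TABLE IF NOT EXISTS People
    (id INTEGER PRIMARY KEY, name TEXT UNIQUE, retrieved INTEGER)''')

insert_sql = 'INSERT OR IGNORE INTO People (name, retrieved) VALUES (?, 0)'

# The first insert adds a row; 'alice' is an illustrative value.
cur.execute(insert_sql, ('alice',))
first_rowcount = cur.rowcount

# The second insert repeats the name, so it is skipped without an error.
cur.execute(insert_sql, ('alice',))
second_rowcount = cur.rowcount

# Without OR IGNORE, the same violation raises sqlite3.IntegrityError.
try:
    cur.execute('INSERT INTO People (name, retrieved) VALUES (?, 0)',
                ('alice',))
    raised = False
except sqlite3.IntegrityError:
    raised = True

cur.execute('SELECT COUNT(*) FROM People')
total = cur.fetchone()[0]
print(first_rowcount, second_rowcount, raised, total)
conn.close()
```

Checking cur.rowcount after the insert is how the program can tell whether a row was actually added.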
Retrieve and/or insert a record
# This fragment uses continue, so it is expected to run inside a loop
# over the accounts being processed.
friend = u['screen_name']
cur.execute('SELECT id FROM People WHERE name = ? LIMIT 1', (friend, ) )
try:
    friend_id = cur.fetchone()[0]
    countold = countold + 1
except:
    cur.execute('''INSERT OR IGNORE INTO People
        (name, retrieved) VALUES ( ?, 0)''', ( friend, ) )
    conn.commit()
    if cur.rowcount != 1 :
        print('Error inserting account:', friend)
        continue
    friend_id = cur.lastrowid
    countnew = countnew + 1
Observations
try: except - Indicates the instruction to be carried out in case of an
exception. In this case the row was not found, hence it should be inserted.
INSERT OR IGNORE - Indicates errors to be avoided while inserting data.
cur.rowcount - Indicates how many rows are affected.
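The retrieve-or-insert pattern above can also be packaged as a small function. The sketch below is a variation, not the original program: the function name get_or_create_person is hypothetical, it tests for a missing row with "is None" instead of a bare except, and it runs against an in-memory database with made-up names.

```python
import sqlite3

def get_or_create_person(conn, name):
    """Return (person_id, created) for name, inserting a row if needed.

    Hypothetical helper for illustration; not part of the original code.
    """
    cur = conn.cursor()
    cur.execute('SELECT id FROM People WHERE name = ? LIMIT 1', (name,))
    row = cur.fetchone()
    if row is not None:
        # The row already exists, so reuse its primary key.
        return row[0], False
    cur.execute('''INSERT OR IGNORE INTO People (name, retrieved)
        VALUES (?, 0)''', (name,))
    conn.commit()
    if cur.rowcount != 1:
        raise RuntimeError('Error inserting account: ' + name)
    # lastrowid holds the primary key the database just assigned.
    return cur.lastrowid, True

# Assumption: an in-memory database with the People table from the slides.
conn = sqlite3.connect(':memory:')
conn.execute('''CREATE TABLE IF NOT EXISTS People
    (id INTEGER PRIMARY KEY, name TEXT UNIQUE, retrieved INTEGER)''')

countold = countnew = 0
seen_ids = []
for friend in ['alice', 'bob', 'alice']:   # illustrative names
    friend_id, created = get_or_create_person(conn, friend)
    seen_ids.append(friend_id)
    if created:
        countnew = countnew + 1
    else:
        countold = countold + 1

print(seen_ids, countnew, countold)
```

Returning the created flag lets the caller keep the countold and countnew tallies without catching exceptions.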
Use Different Types of Keys
Logical Key - is a key that the “real world”
might use to look up a row.
Primary Key - is usually a number that is
assigned automatically by the database.
Foreign Key - is usually a number that
points to the primary key of an associated
row in a different table.
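The three kinds of keys can be seen together in a short sketch that reuses the People and Follows tables from earlier. It assumes an in-memory database and made-up names; here "name" acts as the logical key, "id" as the primary key, and "from_id"/"to_id" as foreign keys.

```python
import sqlite3

# Assumption: an in-memory database; the names are illustrative.
conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('''CREATE TABLE People
    (id INTEGER PRIMARY KEY, name TEXT UNIQUE, retrieved INTEGER)''')
cur.execute('''CREATE TABLE Follows
    (from_id INTEGER, to_id INTEGER, UNIQUE(from_id, to_id))''')

# Primary key: no id is supplied, so the database assigns one.
cur.execute('INSERT INTO People (name, retrieved) VALUES (?, 0)', ('alice',))
alice_id = cur.lastrowid
cur.execute('INSERT INTO People (name, retrieved) VALUES (?, 0)', ('bob',))
bob_id = cur.lastrowid

# Foreign keys: Follows stores the ids of rows in People, not the names.
cur.execute('INSERT INTO Follows (from_id, to_id) VALUES (?, ?)',
            (alice_id, bob_id))

# Logical key: the "real world" looks a person up by name.
cur.execute('SELECT id FROM People WHERE name = ?', ('bob',))
looked_up_id = cur.fetchone()[0]

cur.execute('SELECT from_id, to_id FROM Follows')
follow_row = cur.fetchone()
print(alice_id, bob_id, looked_up_id, follow_row)
conn.close()
```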
Use JOIN to Retrieve Data
JOIN clause – used to combine data from different tables.
SELECT * FROM Follows JOIN People
ON Follows.from_id = People.id WHERE People.id = 1
The JOIN clause indicates that the selected fields come from both the
Follows and People tables.
The ON clause indicates how the two tables are to be joined: here, rows
are matched where Follows.from_id equals People.id.
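The query above can be run against a few sample rows to see what a JOIN returns. This is a minimal sketch assuming an in-memory database and made-up data; the second query (joining on to_id to list the names of the people being followed) is an added illustration, not part of the slides.

```python
import sqlite3

# Assumption: an in-memory database filled with illustrative rows.
conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('''CREATE TABLE People
    (id INTEGER PRIMARY KEY, name TEXT UNIQUE, retrieved INTEGER)''')
cur.execute('''CREATE TABLE Follows
    (from_id INTEGER, to_id INTEGER, UNIQUE(from_id, to_id))''')
cur.executemany('INSERT INTO People (id, name, retrieved) VALUES (?, ?, 0)',
                [(1, 'alice'), (2, 'bob'), (3, 'carol')])
cur.executemany('INSERT INTO Follows (from_id, to_id) VALUES (?, ?)',
                [(1, 2), (1, 3), (2, 3)])

# The query from the slide: every Follows row whose from_id is person 1,
# combined with the matching People row.
cur.execute('''SELECT * FROM Follows JOIN People
    ON Follows.from_id = People.id WHERE People.id = 1''')
joined_rows = sorted(cur.fetchall())

# Added illustration: join on to_id to get the names person 1 follows.
cur.execute('''SELECT People.name FROM Follows JOIN People
    ON Follows.to_id = People.id
    WHERE Follows.from_id = 1 ORDER BY People.name''')
followed_names = [row[0] for row in cur.fetchall()]

for row in joined_rows:
    print(row)
print(followed_names)
conn.close()
```

Each joined row carries the Follows columns first (from_id, to_id) and then the People columns (id, name, retrieved).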
Data Visualization
Data Visualization refers to the techniques used to
communicate data or information by encoding it as visual
objects (e.g., points, lines or bars).
Google Geocoding
Geocoding is the process of converting addresses (like "1600
Amphitheatre Parkway, Mountain View, CA") into geographic
coordinates (like latitude 37.423021 and longitude -122.083739),
which can be used to place markers on a map, or position the map.
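A geocoding request can be sketched as follows. The sketch only builds the request URL and parses a response; it makes no network call. The endpoint is the Geocoding API's JSON endpoint, while the helper names, the 'YOUR_API_KEY' placeholder and the hand-written sample response are assumptions for illustration (the sample mirrors the shape of a real response but was not returned by the service).

```python
import json
import urllib.parse

# JSON endpoint of the Google Geocoding API.
SERVICE_URL = 'https://maps.googleapis.com/maps/api/geocode/json'

def build_geocode_url(address, api_key):
    """Build the request URL for an address (hypothetical helper)."""
    params = {'address': address, 'key': api_key}
    return SERVICE_URL + '?' + urllib.parse.urlencode(params)

def extract_location(response_text):
    """Return (lat, lng) from a response body, or None if no result."""
    data = json.loads(response_text)
    if data.get('status') != 'OK' or not data.get('results'):
        return None
    location = data['results'][0]['geometry']['location']
    return location['lat'], location['lng']

address = '1600 Amphitheatre Parkway, Mountain View, CA'
# 'YOUR_API_KEY' is a placeholder; a real key comes from the console.
url = build_geocode_url(address, 'YOUR_API_KEY')

# Assumption: a hand-written sample in the shape of a real response.
sample_response = json.dumps({
    'status': 'OK',
    'results': [
        {'geometry': {'location': {'lat': 37.423021, 'lng': -122.083739}}}
    ],
})

coordinates = extract_location(sample_response)
print(url)
print(coordinates)
```

To query the live service, the URL could be opened with urllib.request.urlopen and the decoded body passed to extract_location.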
To start using the Geocoding API, create or select a project in the
Google Developers Console and enable the API.
https://console.developers.google.com/flows/enableapi?apiid=geocoding_backend&keyType=SERVER_SIDE
Google Developers Console (screenshots)
Visualizing Networks and Interconnections
What is visualizing networks and interconnections?
The first program (spider.py) crawls a web site and
pulls a series of pages into the database (spider.sqlite),
recording the links between pages. You can restart the
process at any time by removing the spider.sqlite file and
re-running spider.py.
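The idea of recording links between pages can be sketched as below. This is not the actual spider.py: the Pages and Links tables, the LinkCollector class and the page_id helper are hypothetical, and a small dictionary of hard-coded HTML stands in for pages fetched from a web site, so nothing is downloaded.

```python
import sqlite3
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag (hypothetical helper)."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value:
                    self.links.append(value)

# Assumption: hard-coded pages standing in for a crawled web site.
site = {
    'http://example.com/': '<a href="http://example.com/a">A</a>'
                           '<a href="http://example.com/b">B</a>',
    'http://example.com/a': '<a href="http://example.com/b">B</a>',
    'http://example.com/b': '<a href="http://example.com/">Home</a>',
}

# Assumption: an in-memory database with an illustrative schema.
conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE Pages (id INTEGER PRIMARY KEY, url TEXT UNIQUE)')
cur.execute('''CREATE TABLE Links
    (from_id INTEGER, to_id INTEGER, UNIQUE(from_id, to_id))''')

def page_id(url):
    """Return the id for url, inserting the page if it is new."""
    cur.execute('INSERT OR IGNORE INTO Pages (url) VALUES (?)', (url,))
    cur.execute('SELECT id FROM Pages WHERE url = ?', (url,))
    return cur.fetchone()[0]

for url, html in site.items():
    from_id = page_id(url)
    collector = LinkCollector()
    collector.feed(html)
    for target in collector.links:
        # Each link is stored once as a (from_id, to_id) pair.
        cur.execute('INSERT OR IGNORE INTO Links (from_id, to_id) '
                    'VALUES (?, ?)', (from_id, page_id(target)))
conn.commit()

cur.execute('SELECT COUNT(*) FROM Pages')
page_count = cur.fetchone()[0]
cur.execute('SELECT COUNT(*) FROM Links')
link_count = cur.fetchone()[0]
print(page_count, link_count)
```

The Links table is what a later visualization step would read to draw the connections between pages.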
Visualizing Mail Data
Sometimes you have to pull down mail data from servers.
That might take quite some time, and the data might
be inconsistent and error-filled, and need a
lot of cleanup or adjustment.
Here we work with the most complex application of the
session, which pulls down nearly a gigabyte of mail data
and visualizes it.
Just a Minute
__________ Key is used to uniquely
identify records in a table.
Answer: Primary
Activity
Activity: Programming With Multiple Tables
Problem Statement:
Write a simple program to retrieve data from multiple tables.
Prerequisite: For this activity, please refer to school.db, available in the
“Data_File_For_Students” folder.
Summary
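In this session, you have learned that:
Data Visualization refers to the techniques used to communicate
data or information by encoding it as visual objects.
Geocoding is the process of converting addresses into geographic
coordinates, which can be used to place markers on a map.
Networks and interconnections can be visualized by crawling a web
site and recording the links between pages in a database.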