5th Module Notes
What is a socket? Explain how a socket connection can be established over TCP/IP to
retrieve the data from a web page.
A socket is a bidirectional data path to a remote system. A socket is much like a file, except that a
single socket provides a two-way connection between two programs.
Socket programming is a way of connecting two nodes on a network so that they can communicate
with each other.
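As a minimal sketch of the idea, the functions below open a TCP socket, send an HTTP GET request, and read the reply until the server closes the connection. The helper names (build_get_request, read_all, fetch_page) and the use of HTTP/1.0 on port 80 are assumptions made for illustration.

```python
import socket


def build_get_request(host, path="/"):
    """Return the bytes of a minimal HTTP/1.0 GET request."""
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode()


def read_all(sock, chunk_size=512):
    """Read from the socket until the other side closes the connection."""
    chunks = []
    while True:
        data = sock.recv(chunk_size)
        if not data:          # empty bytes means the peer closed the socket
            break
        chunks.append(data)
    return b"".join(chunks)


def fetch_page(host, path="/", port=80):
    """Connect over TCP/IP, request a page, and return the raw reply as text."""
    # AF_INET selects IPv4 addressing, SOCK_STREAM selects TCP.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.connect((host, port))
        sock.sendall(build_get_request(host, path))
        return read_all(sock).decode(errors="replace")
```

Calling fetch_page("www.example.com") would return the response headers followed by the page body, provided the host is reachable from your machine.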
Syntax:
Where,
urllib is Python's URL handling package. It is used to fetch URLs (Uniform Resource
Locators) through the urlopen function, which can fetch URLs using a variety of different
protocols. urllib collects several modules for working with URLs: urllib.request,
urllib.parse, urllib.error, and urllib.robotparser.
Write a program to retrieve data from a web page using urllib and to count the number of words
in it.
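One possible answer is sketched below. The function names fetch_text and count_words are hypothetical, and the page is assumed to be text that can be decoded as UTF-8.

```python
import urllib.request


def fetch_text(url):
    """Open the URL with urlopen and return its content as a string."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def count_words(text):
    """Count whitespace-separated words in the text."""
    return len(text.split())
```

For example, count_words(fetch_text(url)) gives the word count for any url that your machine can reach.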
What is Beautiful Soup? Write a program to illustrate parsing HTML using Beautiful Soup.
Web scraping is a form of data scraping used for extracting data from websites. Web scraping
software may access the World Wide Web (WWW) directly using HTTP or through a web browser.
Beautiful Soup is a Python library used in web scraping to parse the HTML that has been retrieved.
“The HTML content of the webpages can be parsed and scraped with
Beautiful Soup”
Parsing, in this context, is the process of analyzing a piece of text such as an HTML document
and converting it into a structured form (a tree of tags) that a program can search and navigate.
Output
Beautiful Soup can be used to pull out various parts of each tag as follows:
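A minimal sketch, assuming the third-party bs4 package is installed. The HTML string is made-up sample data, and Python's built-in "html.parser" is used as the parser so that no extra parser library is needed.

```python
from bs4 import BeautifulSoup

# Made-up sample page, used here instead of a page fetched from the web.
html = """<html><body>
<h1>Sample page</h1>
<a href="http://www.example.com/page1.htm">First page</a>
<a href="http://www.example.com/page2.htm">Second page</a>
</body></html>"""

soup = BeautifulSoup(html, "html.parser")

# Calling the soup object with a tag name is shorthand for find_all().
tags = soup("a")
for tag in tags:
    print("TAG:", tag)                        # the whole anchor tag
    print("URL:", tag.get("href", None))      # value of the href attribute
    print("Contents:", tag.contents[0])       # text between <a> and </a>
    print("Attrs:", tag.attrs)                # dictionary of all attributes
```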
Using Web Services
Write a note on XML. Design a Python program to retrieve a node present in an XML tree.
The findall method retrieves a Python list of subtrees that represent the user structures in the
XML tree.
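The use of findall can be sketched as follows with the standard xml.etree.ElementTree module. The XML document, with its stuff, users and user elements, is sample data assumed for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed sample document: two <user> nodes inside <users>.
data = """
<stuff>
  <users>
    <user x="2">
      <id>001</id>
      <name>Chuck</name>
    </user>
    <user x="7">
      <id>009</id>
      <name>Brent</name>
    </user>
  </users>
</stuff>"""

stuff = ET.fromstring(data)

# findall returns a list of subtrees, one per matching <user> node.
lst = stuff.findall("users/user")
print("User count:", len(lst))

for item in lst:
    print("Name", item.find("name").text)   # text of the child <name> node
    print("Id", item.find("id").text)       # text of the child <id> node
    print("Attribute", item.get("x"))       # value of the x attribute
```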
Output
What is JSON? Illustrate the concept of parsing JSON with Python code.
JSON: JavaScript Object Notation
The JSON format was inspired by the object and array format used in the JavaScript language.
But since Python was invented before JavaScript, Python’s syntax for dictionaries and lists
influenced the syntax of JSON. So the format of JSON is nearly identical to a combination of
Python lists and dictionaries.
Parsing JSON
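A minimal sketch using the standard json module. The JSON text is assumed sample data: json.loads turns the JSON array into a Python list, and each JSON object into a Python dictionary.

```python
import json

# Assumed sample data: a JSON array of two objects.
data = """
[
  {"id": "001", "x": "2", "name": "Chuck"},
  {"id": "009", "x": "7", "name": "Brent"}
]"""

info = json.loads(data)          # list of dictionaries
print("User count:", len(info))

for item in info:
    print("Name", item["name"])
    print("Id", item["id"])
    print("Attribute", item["x"])
```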
API: Application Program Interface. A contract between applications that defines the patterns of
interaction between two application components.
JSON: JavaScript Object Notation. A format that allows for the markup of structured data based
on the syntax of JavaScript Objects.
When the functionality of our program includes access to services provided by other programs,
we call the approach a Service-oriented architecture (SOA). An SOA approach is one where our
overall application makes use of the services of other applications. A non-SOA approach is one
where the application is a single standalone application that contains all of the code necessary
to implement it.
We see many examples of SOA when we use the web. We can go to a single web site and book
air travel, hotels, and automobiles all from a single site. The data for hotels is not stored on the
airline computers. Instead, the airline computers contact the services on the hotel computers and
retrieve the hotel data and present it to the user. When the user agrees to make a hotel reservation
using the airline site, the airline site uses another web service on the hotel systems to actually
make the reservation. And when it comes time to charge your credit card for the whole
transaction, still other computers become involved in the process.
A Service-oriented architecture has many advantages, including: (1) we always maintain only
one copy of data (this is particularly important for things like hotel reservations where we do not
want to over-commit) and (2) the owners of the data can set the rules about the use of their data.
Even with these advantages, an SOA system must be carefully designed to have good performance
and meet the user’s needs.
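The idea can be sketched in a few lines. Everything here is hypothetical: hotel_service stands in for a web service on the hotel's computers, book_hotel_via_service stands in for the airline site, and a plain function call replaces the HTTP request a real system would make. Because only the hotel service holds the room count, the airline site cannot over-commit.

```python
import json

# Data owned by the (hypothetical) hotel service: the only copy of the room count.
_ROOMS_LEFT = {"Grand": 2}


def hotel_service(request_json):
    """Hypothetical hotel web service: takes a JSON request, returns a JSON reply."""
    request = json.loads(request_json)
    hotel = request["hotel"]
    if request["action"] == "reserve":
        if _ROOMS_LEFT.get(hotel, 0) > 0:
            _ROOMS_LEFT[hotel] -= 1
            return json.dumps({"hotel": hotel, "status": "confirmed"})
        return json.dumps({"hotel": hotel, "status": "sold out"})
    return json.dumps({"hotel": hotel, "status": "unknown action"})


def book_hotel_via_service(hotel):
    """Hypothetical airline-site code: it stores no hotel data, it only asks the service."""
    reply = hotel_service(json.dumps({"action": "reserve", "hotel": hotel}))
    return json.loads(reply)["status"]
```

With two rooms available, the first two calls to book_hotel_via_service("Grand") are confirmed and a third is refused by the service itself.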
Chapter 15
Using Databases and SQL
What is a database?
A database is a file that is organized for storing data. Most databases are organized like a
dictionary in the sense that they map from keys to values. The biggest difference is that the
database is on disk (or other permanent storage), so it persists after the program ends. Because a
database is stored on permanent storage, it can store far more data than a dictionary, which is
limited to the size of the memory in the computer.
There are many different database systems which are used for a wide variety of purposes
including: Oracle, MySQL, Microsoft SQL Server, PostgreSQL, and SQLite.
Database concepts
When you first look at a database it looks like a spreadsheet with multiple sheets. The primary
data structures in a database are: tables, rows, and columns.
In technical descriptions of relational databases the concepts of table, row, and column are more
formally referred to as relation, tuple, and attribute, respectively.
Database Browser for SQLite
Although Python can be used to work with data in SQLite database files, many operations can be
done more conveniently using software called the Database Browser for SQLite.
Using the browser you can easily create tables, insert data, edit data and run simple SQL queries
on the data in the database.
Code to create a database file and a table named Tracks with two columns in the database is as
follows:
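A minimal sketch with the standard sqlite3 module. The function name create_tracks_db is hypothetical, and the database file name is left to the caller.

```python
import sqlite3


def create_tracks_db(path):
    """Create (or reset) the Tracks table in the database at the given path."""
    conn = sqlite3.connect(path)   # creates the file if it does not exist
    cur = conn.cursor()            # the cursor is used to run SQL commands

    # Start from a clean state, then create the two-column table.
    cur.execute('DROP TABLE IF EXISTS Tracks')
    cur.execute('CREATE TABLE Tracks (title TEXT, plays INTEGER)')

    conn.commit()
    return conn
```

Calling create_tracks_db('music.sqlite') would create a file of that name in the current directory (the file name is only an example); remember to call close() on the returned connection when finished.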
Explanation of functions
cursor(): Returns a cursor, which is like a file handle. It is used to perform operations on
the data stored in the database.
execute(): Once we have the cursor, we can begin to execute commands on the contents of
the database using the execute() method.
The database language is called Structured Query Language, or SQL.
The first SQL command removes the Tracks table from the database if it exists.
cur.execute('DROP TABLE IF EXISTS Tracks')
The second command creates a table named Tracks with a text column named title and an integer
column named plays.
cur.execute('CREATE TABLE Tracks (title TEXT, plays INTEGER)')
We can put some data into that table using the SQL INSERT operation.
The SQL INSERT command indicates which table we are using and then defines a new row by
listing the fields we want to include (title, plays) followed by the VALUES we want placed in
the new row. We specify the values as question marks (?, ?) to indicate that the actual values are
passed in as a tuple ('My Way', 15) as the second parameter to the execute() call.
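A minimal sketch of the INSERT step, followed by a SELECT to read the rows back. An in-memory database (':memory:') is assumed here so that the sketch leaves no file behind; a file name can be used instead.

```python
import sqlite3

conn = sqlite3.connect(':memory:')   # assumption: in-memory database for the demo
cur = conn.cursor()
cur.execute('CREATE TABLE Tracks (title TEXT, plays INTEGER)')

# The question marks are placeholders; the tuple supplies the actual values.
cur.execute('INSERT INTO Tracks (title, plays) VALUES (?, ?)',
            ('Thunderstruck', 20))
cur.execute('INSERT INTO Tracks (title, plays) VALUES (?, ?)',
            ('My Way', 15))
conn.commit()                        # write the changes to the database

print('Tracks:')
cur.execute('SELECT title, plays FROM Tracks')
rows = cur.fetchall()                # each row comes back as a tuple
for row in rows:
    print(row)

conn.close()
```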
The output of the program is as follows:
Tracks:
('Thunderstruck', 20)
('My Way', 15)
Spidering Twitter using a database
Source code for our Twitter spidering application
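A minimal sketch of the database bookkeeping such a spider might do. The table layout (name, retrieved, friends), the function names, and the fetch_friends parameter are all assumptions: fetch_friends is a hypothetical stand-in for the authenticated Twitter API call, which is not shown here.

```python
import sqlite3


def open_spider_db(path=':memory:'):
    """Open the database and make sure the (assumed) Twitter table exists."""
    conn = sqlite3.connect(path)
    conn.execute('CREATE TABLE IF NOT EXISTS Twitter '
                 '(name TEXT, retrieved INTEGER, friends INTEGER)')
    return conn


def spider_account(conn, account, fetch_friends):
    """Record the friends of one account; return (new, revisited) counts."""
    cur = conn.cursor()
    # Mark the account as visited so it is not picked again later.
    cur.execute('UPDATE Twitter SET retrieved = 1 WHERE name = ?', (account,))

    count_new = 0
    count_old = 0
    for friend in fetch_friends(account):
        cur.execute('SELECT friends FROM Twitter WHERE name = ? LIMIT 1',
                    (friend,))
        row = cur.fetchone()
        if row is None:
            # First time we see this account: store it as not yet retrieved.
            cur.execute('INSERT INTO Twitter (name, retrieved, friends) '
                        'VALUES (?, 0, 1)', (friend,))
            count_new += 1
        else:
            # Seen before: increase its friend count.
            cur.execute('UPDATE Twitter SET friends = ? WHERE name = ?',
                        (row[0] + 1, friend))
            count_old += 1
    conn.commit()
    print('New accounts=', count_new, ' revisited=', count_old)
    return count_new, count_old
```

Because the table persists between calls, spidering an account whose friends are already stored reports them as revisited rather than new.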
Output:
Enter a Twitter account, or quit: drchuck
Retrieving http://api.twitter.com/1.1/friends ...
New accounts= 20 revisited= 0
Enter a Twitter account, or quit: quit