Python Network Programming: David M. Beazley

This document provides an introduction to a course on Python network programming. It will cover low-level socket programming, high-level client modules, common data encodings, simple web programming using HTTP, and simple distributed computing. The course materials and exercises are provided in a supporting zip file. The prerequisites are a basic knowledge of Python and general network concepts.


Python Network Programming

David M. Beazley
http://www.dabeaz.com

Edition: Thu Jun 17 19:49:58 2010

Copyright (C) 2010 David M Beazley
All Rights Reserved
Python Network Programming : Table of Contents

1. Network Fundamentals
2. Client Programming
3. Internet Data Handling
4. Web Programming Basics
5. Advanced Networks

Section 0

Introduction

Support Files
• Course exercises:
http://www.dabeaz.com/python/pythonnetwork.zip
• This zip file should be downloaded and extracted
someplace on your machine
• All of your work will take place in the
"PythonNetwork" folder

Copyright (C) 2010, http://www.dabeaz.com

Python Networking

• Network programming is a major use of Python
• Python standard library has wide support for
network protocols, data encoding/decoding, and
other things you need to make it work
• Writing network programs in Python tends to be
substantially easier than in C/C++

This Course
• This course focuses on the essential details of
network programming that all Python
programmers should probably know
• Low-level programming with sockets
• High-level client modules
• How to deal with common data encodings
• Simple web programming (HTTP)
• Simple distributed computing

Standard Library
• We will only cover modules supported by the
Python standard library
• These come with Python by default
• Keep in mind, much more functionality can be
found in third-party modules
• Will give links to notable third-party libraries as
appropriate

Prerequisites

• You should already know Python basics
• However, you don't need to be an expert on all
of its advanced features (in fact, none of the code
to be written is highly sophisticated)
• You should have some prior knowledge of
systems programming and network concepts

Section 1

Network Fundamentals

The Problem
• Communication between computers
[Figure: computers connected through a network]
• It's just sending/receiving bits

Two Main Issues

• Addressing
• Specifying a remote computer and service
• Data transport
• Moving bits back and forth

Network Addressing
• Machines have a hostname and IP address
• Programs/services have port numbers
[Figure: foo.bar.com (205.172.13.4), port 4521, connected over the network to www.python.org (82.94.237.218), port 80]

Standard Ports
• Ports for common services are preassigned
21 FTP
22 SSH
23 Telnet
25 SMTP (Mail)
80 HTTP (Web)
110 POP3 (Mail)
119 NNTP (News)
443 HTTPS (Web)

• Other port numbers may just be randomly
assigned to programs by the operating system

Using netstat
• Use 'netstat' to view active network connections
shell % netstat -a
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address State
tcp        0      0 *:imaps                 *:*             LISTEN
tcp        0      0 *:pop3s                 *:*             LISTEN
tcp        0      0 localhost:mysql         *:*             LISTEN
tcp        0      0 *:pop3                  *:*             LISTEN
tcp        0      0 *:imap2                 *:*             LISTEN
tcp        0      0 *:8880                  *:*             LISTEN
tcp        0      0 *:www                   *:*             LISTEN
tcp        0      0 192.168.119.139:domain  *:*             LISTEN
tcp        0      0 localhost:domain        *:*             LISTEN
tcp        0      0 *:ssh                   *:*             LISTEN
...

• Note: Must execute from the command shell on
both Unix and Windows

Connections
• Each endpoint of a network connection is always
represented by a host and port #
• In Python you write it out as a tuple (host,port)
("www.python.org",80)
("205.172.13.4",443)

• In almost all of the network programs you’ll
write, you use this convention to specify a
network address

Client/Server Concept
• Each endpoint is a running program
• Servers wait for incoming connections and
provide a service (e.g., web, mail, etc.)
• Clients make connections to servers
[Figure: a client (browser) connects to a server (web) running on www.bar.com (205.172.13.4), port 80]

Request/Response Cycle
• Most network programs use a request/response
model based on messages
• Client sends a request message (e.g., HTTP)
GET /index.html HTTP/1.0

• Server sends back a response message
HTTP/1.0 200 OK
Content-type: text/html
Content-length: 48823

<HTML>
...

• The exact format depends on the application
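To make the message layout concrete, here is a minimal sketch that splits a raw HTTP response into its status line, headers, and body. It is written for Python 3 (the slides themselves use Python 2), and the function name `parse_response` and the sample bytes are illustrative assumptions. Real HTTP parsing has many more cases than this handles.

```python
def parse_response(raw):
    """Split raw HTTP response bytes into (status, reason, headers, body)."""
    # The header block ends at the first blank line (CRLF CRLF).
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # The status line looks like: HTTP/1.0 200 OK
    version, status, reason = lines[0].split(" ", 2)

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(status), reason, headers, body


# Hypothetical response, shaped like the one on the slide
sample = (b"HTTP/1.0 200 OK\r\n"
          b"Content-type: text/html\r\n"
          b"Content-length: 6\r\n"
          b"\r\n"
          b"<HTML>")
status, reason, headers, body = parse_response(sample)
print(status, reason, headers["content-type"], body)
```

Header names are lowercased here only to make lookups predictable; that is a choice of this sketch, not something the protocol requires of the sender.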

Using Telnet
• As a debugging aid, telnet can be used to
directly communicate with many services
telnet hostname portnum

• Example:
shell % telnet www.python.org 80
Trying 82.94.237.218...
Connected to www.python.org.
Escape character is '^]'.
GET /index.html HTTP/1.0          (type this and press return a few times)

HTTP/1.1 200 OK
Date: Mon, 31 Mar 2008 13:34:03 GMT
Server: Apache/2.2.3 (Debian) DAV/2 SVN/1.4.2 mod_ssl/2.2.3 OpenSSL/0.9.8c
...

Data Transport
• There are two basic types of communication
• Streams (TCP): Computers establish a
connection with each other and read/write data
in a continuous stream of bytes---like a file. This
is the most common.
• Datagrams (UDP): Computers send discrete
packets (or messages) to each other. Each
packet contains a collection of bytes, but each
packet is separate and self-contained.

Sockets
• Programming abstraction for network code
• Socket: A communication endpoint
[Figure: two sockets, one at each end of a network connection]
• Supported by socket library module
• Allows connections to be made and data to be
transmitted in either direction

Socket Basics
• To create a socket
import socket
s = socket.socket(addr_family, type)

• Address families
socket.AF_INET Internet protocol (IPv4)
socket.AF_INET6 Internet protocol (IPv6)

• Socket types
socket.SOCK_STREAM Connection based stream (TCP)
socket.SOCK_DGRAM Datagrams (UDP)

• Example:
from socket import *
s = socket(AF_INET,SOCK_STREAM)

Socket Types
• Almost all code will use one of the following
from socket import *

s = socket(AF_INET, SOCK_STREAM)
s = socket(AF_INET, SOCK_DGRAM)

• Most common case: TCP connection
s = socket(AF_INET, SOCK_STREAM)

Using a Socket
• Creating a socket is only the first step
s = socket(AF_INET, SOCK_STREAM)

• Further use depends on application
• Server
  • Listen for incoming connections
• Client
  • Make an outgoing connection

TCP Client
• How to make an outgoing connection
from socket import *
s = socket(AF_INET,SOCK_STREAM)
s.connect(("www.python.org",80)) # Connect
s.send("GET /index.html HTTP/1.0\n\n") # Send request
data = s.recv(10000) # Get response
s.close()

• s.connect(addr) makes a connection
s.connect(("www.python.org",80))

• Once connected, use send(),recv() to
transmit and receive data
• close() shuts down the connection
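The connect/send/recv/close sequence can be tried without touching the internet. The sketch below is written for Python 3 (where data is sent as bytes, unlike the Python 2 code on the slide) and talks to a throwaway server on the loopback interface. The names `fetch` and `_one_shot_server` are hypothetical helpers invented for this example.

```python
import socket
import threading


def fetch(host, port, request):
    """Connect, send a request, and read until the server closes."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.connect((host, port))          # Connect
        s.sendall(request)               # Send request (bytes in Python 3)
        chunks = []
        while True:
            chunk = s.recv(10000)        # Get response
            if not chunk:                # Empty result: server closed
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        s.close()


def _one_shot_server(listener):
    # Stand-in for a real server: answer one client and quit.
    conn, _ = listener.accept()
    conn.recv(10000)
    conn.sendall(b"HTTP/1.0 200 OK\r\n\r\nhello")
    conn.close()


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))          # Port 0: let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]
t = threading.Thread(target=_one_shot_server, args=(listener,))
t.start()

reply = fetch("127.0.0.1", port, b"GET /index.html HTTP/1.0\r\n\r\n")
t.join()
listener.close()
print(reply)
```

Reading in a loop until `recv()` returns nothing is a deliberate difference from the single `recv(10000)` on the slide, since one call is not guaranteed to return the whole response.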

Exercise 1.1

Time : 10 Minutes

Server Implementation

• Network servers are a bit more tricky
• Must listen for incoming connections on a
well-known port number
• Typically run forever in a server-loop
• May have to service multiple clients

TCP Server
• A simple server
from socket import *
s = socket(AF_INET,SOCK_STREAM)
s.bind(("",9000))
s.listen(5)
while True:
    c,a = s.accept()
    print "Received connection from", a
    c.send("Hello %s\n" % a[0])
    c.close()

• Send a message back to a client
% telnet localhost 9000
Connected to localhost.
Escape character is '^]'.
Hello 127.0.0.1                    (server message)
Connection closed by foreign host.
%

TCP Server
• Address binding
from socket import *
s = socket(AF_INET,SOCK_STREAM)
s.bind(("",9000))                  # Binds the socket to a specific address
s.listen(5)
while True:
    c,a = s.accept()
    print "Received connection from", a
    c.send("Hello %s\n" % a[0])
    c.close()

• Addressing
s.bind(("",9000))                  # Binds to all available interfaces
s.bind(("localhost",9000))         # Binds to localhost
s.bind(("192.168.2.1",9000))       # If system has multiple IP addresses,
s.bind(("104.21.4.2",9000))        # can bind to a specific address

TCP Server
• Start listening for connections
from socket import *
s = socket(AF_INET,SOCK_STREAM)
s.bind(("",9000))
s.listen(5)                        # Tells operating system to start listening
                                   # for connections on the socket
while True:
    c,a = s.accept()
    print "Received connection from", a
    c.send("Hello %s\n" % a[0])
    c.close()

• s.listen(backlog)
• backlog is # of pending connections to allow
• Note: not related to max number of clients

TCP Server
• Accepting a new connection
from socket import *
s = socket(AF_INET,SOCK_STREAM)
s.bind(("",9000))
s.listen(5)
while True:
    c,a = s.accept()               # Accept a new client connection
    print "Received connection from", a
    c.send("Hello %s\n" % a[0])
    c.close()

• s.accept() blocks until connection received
• Server sleeps if nothing is happening

TCP Server
• Client socket and address
from socket import *
s = socket(AF_INET,SOCK_STREAM)
s.bind(("",9000))
s.listen(5)
while True:
    c,a = s.accept()               # Accept returns a pair (client_socket,addr)
    print "Received connection from", a
    c.send("Hello %s\n" % a[0])
    c.close()

c : <socket._socketobject object at 0x3be30>
    This is a new socket that's used for data
a : ("104.23.11.4",27743)
    This is the network/port address of the client that connected

TCP Server
• Sending data
from socket import *
s = socket(AF_INET,SOCK_STREAM)
s.bind(("",9000))
s.listen(5)
while True:
    c,a = s.accept()
    print "Received connection from", a
    c.send("Hello %s\n" % a[0])    # Send data to client
    c.close()

Note: Use the client socket for
transmitting data. The server
socket is only used for
accepting new connections.

TCP Server
• Closing the connection
from socket import *
s = socket(AF_INET,SOCK_STREAM)
s.bind(("",9000))
s.listen(5)
while True:
    c,a = s.accept()
    print "Received connection from", a
    c.send("Hello %s\n" % a[0])
    c.close()                      # Close client connection

• Note: Server can keep client connection alive
as long as it wants
• Can repeatedly receive/send data

TCP Server
• Waiting for the next connection
from socket import *
s = socket(AF_INET,SOCK_STREAM)
s.bind(("",9000))
s.listen(5)
while True:
    c,a = s.accept()               # Wait for next connection
    print "Received connection from", a
    c.send("Hello %s\n" % a[0])
    c.close()

• Original server socket is reused to listen for
more connections
• Server runs forever in a loop like this
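The bind/listen/accept/send/close cycle walked through on these slides can be exercised end to end on one machine. This is a minimal Python 3 sketch, not the course's exercise solution: the function `serve` and its `max_clients` argument are assumptions added so that the loop stops after a fixed number of clients instead of running forever.

```python
import socket
import threading


def serve(listener, max_clients):
    """Greet max_clients connections, then return."""
    for _ in range(max_clients):
        c, a = listener.accept()                  # Blocks until a client connects
        c.sendall(("Hello %s\n" % a[0]).encode("ascii"))
        c.close()                                 # Done with this client


s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(("127.0.0.1", 0))                          # Port 0 = any free port
s.listen(5)
port = s.getsockname()[1]

server_thread = threading.Thread(target=serve, args=(s, 2))
server_thread.start()

greetings = []
for _ in range(2):
    client = socket.create_connection(("127.0.0.1", port))
    data = b""
    while True:
        chunk = client.recv(1000)
        if not chunk:                             # Server closed the connection
            break
        data += chunk
    client.close()
    greetings.append(data)

server_thread.join()
s.close()
print(greetings)
```

The listening socket `s` is only ever used for `accept()`. Each greeting travels over the separate socket `c` that `accept()` returns.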

Exercise 1.2

Time : 20 Minutes

Advanced Sockets
• Socket programming is often a mess
• Huge number of options
• Many corner cases
• Many failure modes/reliability issues
• Will briefly cover a few critical issues

Partial Reads/Writes
• Be aware that reading/writing to a socket
may involve partial data transfer
• send() returns actual bytes sent
• recv() length is only a maximum limit
>>> len(data)
1000000
>>> s.send(data)
37722                              (sent partial data)
>>>

>>> data = s.recv(10000)
>>> len(data)
6420                               (received less than max)
>>>

Partial Reads/Writes
• Be aware that for TCP, the data stream is
continuous---no concept of records, etc.
# Client
...
s.send(data)
s.send(moredata)
...

# Server
...
data = s.recv(maxsize)             # This recv() may return data from both of
...                                # the sends combined or less data than
                                   # even the first send

• A lot depends on OS buffers, network
bandwidth, congestion, etc.
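Because the stream has no record boundaries, a receiver that wants fixed-size records has to keep reading until it has enough bytes. The following is a minimal Python 3 sketch of that idea. The helper name `recv_exactly` is an assumption of this example (it is not a socket module function), and `socket.socketpair()` stands in for a real client and server.

```python
import socket


def recv_exactly(sock, nbytes):
    """Keep calling recv() until exactly nbytes have arrived."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)     # May return fewer bytes than asked
        if not chunk:
            raise EOFError("connection closed with %d bytes missing" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


a, b = socket.socketpair()               # Two connected stream sockets
a.sendall(b"HELLO")                      # Two separate sends...
a.sendall(b"WORLD")
first = recv_exactly(b, 5)               # ...recovered as two 5-byte records
second = recv_exactly(b, 5)
a.close()
b.close()
print(first, second)
```

A plain `b.recv(1000)` in place of the first `recv_exactly` call could legitimately return `b"HELLOWORLD"`, which is the behavior the slide warns about.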

Sending All Data
• To wait until all data is sent, use sendall()
s.sendall(data)

• Blocks until all data is transmitted
• For most normal applications, this is what
you should use
• Exception: You don’t use this if networking is
mixed in with other kinds of processing
(e.g., screen updates, multitasking, etc.)
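To show what sendall() is doing on your behalf, here is a hand-written loop with the same effect, as a Python 3 sketch. The names `send_all` and `drain` are made up for this illustration, and a background thread plays the role of the receiving program so the sender never stalls on a full buffer.

```python
import socket
import threading


def send_all(sock, data):
    """Hand-written loop with the same effect as sock.sendall(data)."""
    view = memoryview(data)
    total = 0
    while total < len(data):
        sent = sock.send(view[total:])   # send() may accept only part of it
        total += sent
    return total


def drain(sock, sink):
    # Reader side: collect everything until the writer closes.
    while True:
        chunk = sock.recv(65536)
        if not chunk:
            break
        sink.append(chunk)


a, b = socket.socketpair()
received = []
reader = threading.Thread(target=drain, args=(b, received))
reader.start()

payload = b"x" * 1000000
count = send_all(a, payload)
a.close()                                # Signals end of data to the reader
reader.join()
b.close()
print(count, len(b"".join(received)))
```

In a program that mixes networking with other work, a loop like this is where you would interleave that other processing between partial sends, which is why sendall() is not a fit there.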

End of Data
• How to tell if there is no more data?
• recv() will return empty string
>>> s.recv(1000)
''
>>>

• This means that the other end of the
connection has been closed (no more sends)

Data Reassembly
• Receivers often need to reassemble
messages from a series of small chunks
• Here is a programming template for that
fragments = []                     # List of chunks
done = False                       # Set to True once the application has a full message
while not done:
    chunk = s.recv(maxsize)        # Get a chunk
    if not chunk:
        break                      # EOF. No more data
    fragments.append(chunk)

# Reassemble the message
message = "".join(fragments)

• Don't use string concat (+=). It's slow.

Timeouts
• Most socket operations block indefinitely
• Can set an optional timeout
s = socket(AF_INET, SOCK_STREAM)
...
s.settimeout(5.0) # Timeout of 5 seconds
...

• Will get a timeout exception
>>> s.recv(1000)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
socket.timeout: timed out
>>>

• Disabling timeouts
s.settimeout(None)
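In a program, the timeout shows up as an exception that you catch. A minimal Python 3 sketch follows, using `socket.socketpair()` as a stand-in for a real connection. The 0.2 second value is an arbitrary choice to keep the example quick.

```python
import socket

a, b = socket.socketpair()
b.settimeout(0.2)                        # Give up after 0.2 seconds

try:
    b.recv(1000)                         # Nothing has been sent yet
    timed_out = False
except socket.timeout:                   # Raised when the wait runs out
    timed_out = True

a.sendall(b"late data")
data = b.recv(1000)                      # Data is waiting, so no timeout here
a.close()
b.close()
print(timed_out, data)
```

The socket stays usable after a timeout, so the same object can simply be read again once data shows up.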

Non-blocking Sockets
• Instead of timeouts, can set non-blocking
>>> s.setblocking(False)

• Future send(),recv() operations will raise an
exception if the operation would have blocked
>>> s.setblocking(False)
>>> s.recv(1000)                   (no data available)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
socket.error: (35, 'Resource temporarily unavailable')
>>> s.recv(1000)                   (data arrived)
'Hello World\n'
>>>

• Sometimes used for polling
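A polling pattern along these lines might look like the Python 3 sketch below. In Python 3 the "would have blocked" condition is reported as `BlockingIOError` rather than the `socket.error` shown in the Python 2 session. Using `select.select()` to wait briefly before the second read is a choice of this example, not something the slide prescribes.

```python
import select
import socket

a, b = socket.socketpair()
b.setblocking(False)

try:
    b.recv(1000)                         # No data available yet
    would_block = False
except BlockingIOError:                  # Python 3 form of "would have blocked"
    would_block = True

a.sendall(b"Hello World\n")

# Poll: wait up to one second for the socket to become readable
ready, _, _ = select.select([b], [], [], 1.0)
data = b.recv(1000) if ready else b""

a.close()
b.close()
print(would_block, data)
```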

Socket Options
• Sockets have a large number of parameters
• Can be set using s.setsockopt()
• Example: Reusing the port number
>>> s.bind(("",9000))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<string>", line 1, in bind
socket.error: (48, 'Address already in use')
>>> s.setsockopt(socket.SOL_SOCKET,
... socket.SO_REUSEADDR, 1)
>>> s.bind(("",9000))
>>>

• Consult reference for more options
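In a server script the option is normally set before the first bind() rather than after a failure. A minimal Python 3 sketch, in which binding to port 0 (so the operating system picks a free port) is an assumption made to keep the example from colliding with anything already running:

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Set the option before bind() so the port can be reused right after a restart
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(("127.0.0.1", 0))

# getsockopt() reads the current value back (nonzero means enabled)
reuse = s.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR)
port = s.getsockname()[1]
s.close()
print(reuse != 0, port)
```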

Sockets as Files
• Sometimes it is easier to work with sockets
represented as a "file" object
f = s.makefile()

• This will wrap a socket with a file-like API
f.read()
f.readline()
f.write()
f.writelines()
for line in f:
    ...
f.close()
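As a small illustration of the file-like wrapper, the Python 3 sketch below reads newline-terminated text from a socket by iterating over it. The `socketpair()` and the sample lines are assumptions of the example. Note that the file object and the socket are closed separately.

```python
import socket

a, b = socket.socketpair()
a.sendall(b"first line\nsecond line\n")
a.close()                                # EOF, so the iteration below terminates

f = b.makefile("r", encoding="ascii")    # Text-mode file wrapper around the socket
lines = [line.rstrip("\n") for line in f]
f.close()                                # Close the file object...
b.close()                                # ...and the socket separately
print(lines)
```

If the sender never closed its end, the `for` loop over `f` would wait indefinitely for more lines.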

Sockets as Files
• Commentary : From personal experience,
putting a file-like layer over a socket rarely
works as well in practice as it sounds in theory.
• Tricky resource management (must manage
both the socket and file independently)
• It's easy to write programs that mysteriously
"freeze up" or don't operate quite like you
would expect.

Exercise 1.3

Time : 15 Minutes

Odds and Ends


• Other supported socket types
  • Datagram (UDP) sockets
  • Unix domain sockets
  • Raw sockets/Packets
• Sockets and concurrency
• Useful utility functions

UDP : Datagrams
[Figure: data sent as a series of separate DATA packets]
• Data sent in discrete packets (Datagrams)
• No concept of a "connection"
• No reliability, no ordering of data
• Datagrams may be lost, arrive in any order
• Higher performance (used in games, etc.)

UDP Server
• A simple datagram server
from socket import *
s = socket(AF_INET,SOCK_DGRAM)          # Create datagram socket
s.bind(("",10000))                      # Bind to a specific port
while True:
    data, addr = s.recvfrom(maxsize)    # Wait for a message
    resp = "Get off my lawn!"
    s.sendto(resp,addr)                 # Send response (optional)

• No "connection" is established
• It just sends and receives packets

UDP Client
• Sending a datagram to a server
from socket import *
s = socket(AF_INET,SOCK_DGRAM)          # Create datagram socket

msg = "Hello World"
s.sendto(msg,("server.com",10000))      # Send a message
data, addr = s.recvfrom(maxsize)        # Wait for a response (optional)
                                        # data = returned data, addr = remote address

• Key concept: No "connection"
• You just send a data packet
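The UDP server and client from the last two slides can be combined into one self-contained exchange on the loopback interface. This is a Python 3 sketch, so messages are bytes. The 2 second timeouts and the use of port 0 are assumptions added so the example cannot hang or clash with an existing service.

```python
import socket

# "Server" side: bind so that clients have an address to send to
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))            # Port 0: let the OS pick a free port
server.settimeout(2.0)
server_addr = server.getsockname()

# "Client" side: no connect() needed, just send a packet
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.settimeout(2.0)
client.sendto(b"Hello World", server_addr)

data, addr = server.recvfrom(8192)       # Wait for a message
server.sendto(b"Get off my lawn!", addr) # Reply to whoever sent it
resp, _ = client.recvfrom(8192)          # Wait for the response

client.close()
server.close()
print(data, resp)
```

Each `recvfrom()` returns one whole datagram, so there is none of the reassembly work that TCP streams need.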

Unix Domain Sockets


• Available on Unix based systems. Sometimes
used for fast IPC or pipes between processes
• Creation:
s = socket(AF_UNIX, SOCK_STREAM)
s = socket(AF_UNIX, SOCK_DGRAM)

• Address is just a "filename"
s.bind("/tmp/foo")       # Server binding
s.connect("/tmp/foo")    # Client connection

• Rest of the programming interface is the same

Raw Sockets
• If you have root/admin access, can gain direct
access to raw network packets
• Depends on the system
• Example: Linux packet sniffing
s = socket(AF_PACKET, SOCK_DGRAM)
s.bind(("eth0",0x0800)) # Sniff IP packets

while True:
    msg,addr = s.recvfrom(4096)    # get a packet
    ...

Sockets and Concurrency


• Servers usually handle multiple clients
[Figure: several clients (browsers) connected to a single web server on port 80]

Sockets and Concurrency
• Each client gets its own socket on server
# server code
s = socket(AF_INET, SOCK_STREAM)   # A connection point for clients
...
while True:
    c,a = s.accept()               # Client data transmitted on a different socket
    ...
[Figure: each browser is attached to its own socket on the web server]

Sockets and Concurrency


• New connections make a new socket
[Figure: a new browser connects to the server socket on port 80; accept() creates a new socket on the server, and send()/recv() take place over that new socket]

Sockets and Concurrency
• To manage multiple clients,
  • Server must always be ready to accept
    new connections
  • Must allow each client to operate
    independently (each may be performing
    different tasks on the server)
• Will briefly outline the common solutions

Threaded Server
• Each client is handled by a separate thread
import threading
from socket import *

def handle_client(c):
    # ... whatever ...
    c.close()
    return

s = socket(AF_INET,SOCK_STREAM)
s.bind(("",9000))
s.listen(5)
while True:
    c,a = s.accept()
    t = threading.Thread(target=handle_client,
                         args=(c,))
    t.start()                      # Launch the handler thread

Forking Server (Unix)
• Each client is handled by a subprocess
import os
from socket import *
s = socket(AF_INET,SOCK_STREAM)
s.bind(("",9000))
s.listen(5)
while True:
    c,a = s.accept()
    if os.fork() == 0:
        # Child process. Manage client
        ...
        c.close()
        os._exit(0)
    else:
        # Parent process. Clean up and go
        # back to wait for more connections
        c.close()

• Note: Omitting some critical details

Asynchronous Server
• Server handles all clients in an event loop
import select
from socket import *
s = socket(AF_INET,SOCK_STREAM)
...
clients = []    # List of all active client sockets
while True:
    # Look for activity on any of my sockets
    input,output,err = select.select([s]+clients,
                                     clients, clients)
    # Process all sockets with input
    for i in input:
        ...
    # Process all sockets ready for output
    for o in output:
        ...

• Frameworks such as Twisted build upon this
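One pass of such an event loop can be sketched in Python 3 as follows. This is an illustration under stated assumptions, not a complete server: `echo_ready` is a made-up helper, socket pairs stand in for accepted client connections, and the "service" just echoes the data back in upper case.

```python
import select
import socket


def echo_ready(sockets, timeout=1.0):
    """One pass of an event loop: answer every socket that has input."""
    readable, _, _ = select.select(sockets, [], [], timeout)
    for sock in readable:
        data = sock.recv(1000)
        if data:
            sock.sendall(data.upper())
    return len(readable)


# Two hypothetical clients, each joined to the "server" by a socket pair
client1, server1 = socket.socketpair()
client2, server2 = socket.socketpair()
client1.sendall(b"hello")
client2.sendall(b"world")

handled = 0
for _ in range(10):                      # Bounded stand-in for "while True"
    if handled >= 2:
        break
    handled += echo_ready([server1, server2])

client1.settimeout(2.0)
client2.settimeout(2.0)
reply1 = client1.recv(1000)
reply2 = client2.recv(1000)
for sock in (client1, server1, client2, server2):
    sock.close()
print(reply1, reply2)
```

A single thread serves both clients here. Nothing blocks on one client while the other has data waiting, which is the point of the event-loop design.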

Utility Functions
• Get the hostname of the local machine
>>> socket.gethostname()
'foo.bar.com'
>>>

• Get the IP address of a remote machine
>>> socket.gethostbyname("www.python.org")
'82.94.237.218'
>>>

• Get name information on a remote IP
>>> socket.gethostbyaddr("82.94.237.218")
('dinsdale.python.org', [], ['82.94.237.218'])
>>>

Omissions
• socket module has hundreds of obscure
socket control options, flags, etc.
• Many more utility functions
• IPv6 (Supported, but new and hairy)
• Other socket types (SOCK_RAW, etc.)
• More on concurrent programming (covered in
advanced course)

Discussion
• It is often unnecessary to directly use sockets
• Other library modules simplify use
• However, those modules assume some
knowledge of the basic concepts (addresses,
ports, TCP, UDP, etc.)
• Will see more in the next few sections...

Section 2

Client Programming

Overview

• Python has library modules for interacting with
a variety of standard internet services
• HTTP, FTP, SMTP, NNTP, XML-RPC, etc.
• In this section we're going to look at how some
of these library modules work
• Main focus is on the web (HTTP)

urllib Module
• A high level module that allows clients to
connect to a variety of internet services
  • HTTP
  • HTTPS
  • FTP
  • Local files
• Works with typical URLs on the web...

urllib Module
• Open a web page: urlopen()
>>> import urllib
>>> u = urllib.urlopen("http://www.python.org/index.html")
>>> data = u.read()
>>> print data
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML ...
...
>>>

• urlopen() returns a file-like object
• Read from it to get downloaded data

urllib protocols

• Supported protocols
u = urllib.urlopen("http://www.foo.com")
u = urllib.urlopen("https://www.foo.com/private")
u = urllib.urlopen("ftp://ftp.foo.com/README")
u = urllib.urlopen("file:///Users/beazley/blah.txt")

• Note: HTTPS only supported if Python
configured with support for OpenSSL

HTML Forms
• One use of urllib is to automate forms

• Example HTML source for the form
<FORM ACTION="/subscribe" METHOD="POST">
Your name: <INPUT type="text" name="name" size="30"><br>
Your email: <INPUT type="text" name="email" size="30"><br>
<INPUT type="submit" name="submit-button" value="Subscribe">

HTML Forms
• Within the form, you will find an action and
named parameters for the form fields
<FORM ACTION="/subscribe" METHOD="POST">
Your name: <INPUT type="text" name="name" size="30"><br>
Your email: <INPUT type="text" name="email" size="30"><br>
<INPUT type="submit" name="submit-button" value="Subscribe">

• Action (a URL)
http://somedomain.com/subscribe

• Parameters:
name
email

Web Services
• Another use of urllib is to access web services
• Downloading maps
• Stock quotes
• Email messages
• Most of these are controlled and accessed in
the same manner as a form
• There is a particular request and expected set
of parameters for different operations

Parameter Encoding
• urlencode()
• Takes a dictionary of fields and creates a
URL-encoded string of parameters
fields = {
'name' : 'Dave',
'email' : 'dave@dabeaz.com'
}

parms = urllib.urlencode(fields)

• Sample result
>>> parms
'name=Dave&email=dave%40dabeaz.com'
>>>

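For readers on Python 3, the same function lives in `urllib.parse` rather than `urllib`. The sketch below encodes the fields from the slide and then decodes them again with `parse_qs()` to show that the encoding is reversible. The round trip is an addition of this example.

```python
from urllib.parse import urlencode, parse_qs

fields = {
    "name": "Dave",
    "email": "dave@dabeaz.com",
}

parms = urlencode(fields)        # '@' becomes %40, pairs are joined with '&'
decoded = parse_qs(parms)        # Reverse direction: each value comes back as a list
print(parms)
print(decoded)
```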

Sending Parameters
• Case 1 : GET Requests
<FORM ACTION="/subscribe" METHOD="GET">
Your name: <INPUT type="text" name="name" size="30"><br>
Your email: <INPUT type="text" name="email" size="30"><br>
<INPUT type="submit" name="submit-button" value="Subscribe">

• Example code:
fields = { ... }
parms = urllib.urlencode(fields)
u = urllib.urlopen("http://somedomain.com/subscribe?"+parms)

You create a long URL by concatenating
the request with the parameters

http://somedomain.com/subscribe?name=Dave&email=dave%40dabeaz.com

Sending Parameters
• Case 2 : POST Requests
<FORM ACTION="/subscribe" METHOD="POST">
Your name: <INPUT type="text" name="name" size="30"><br>
Your email: <INPUT type="text" name="email" size="30"><br>
<INPUT type="submit" name="submit-button" value="Subscribe">

• Example code:
fields = { ... }
parms = urllib.urlencode(fields)
u = urllib.urlopen("http://somedomain.com/subscribe", parms)

Parameters get uploaded separately
as part of the request body
POST /subscribe HTTP/1.0
...
name=Dave&email=dave%40dabeaz.com
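The difference between the two cases can be inspected without sending anything over the network. This Python 3 sketch builds both requests with `urllib.request.Request` (the Python 3 home of this functionality) and only looks at them. The URL is the placeholder from the slides, and no `urlopen()` call is made.

```python
from urllib.parse import urlencode
from urllib.request import Request

parms = urlencode({"name": "Dave", "email": "dave@dabeaz.com"})

# Case 1: GET. Parameters ride along in the URL.
get_req = Request("http://somedomain.com/subscribe?" + parms)

# Case 2: POST. Parameters go in the body, which must be bytes in Python 3.
post_req = Request("http://somedomain.com/subscribe",
                   data=parms.encode("ascii"))

print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.data)
```

Supplying `data` is what switches the request from GET to POST, mirroring the one-argument and two-argument forms of urlopen() on the slides.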

Response Data
• To read response data, treat the result of
urlopen() as a file object
>>> u = urllib.urlopen("http://www.python.org")
>>> data = u.read()
>>>

• Be aware that the response data consists of
the raw bytes transmitted
• If there is any kind of extra encoding (e.g.,
Unicode), you will need to decode the data
with extra processing steps.
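One possible "extra processing step" is to pick the character set out of the Content-Type header and decode the bytes with it. The Python 3 sketch below does that with the standard `email.message.Message` class. The helper name `decode_body`, the sample bytes, and the ISO-8859-1 fallback are assumptions of this example.

```python
from email.message import Message


def decode_body(raw, content_type, default="iso-8859-1"):
    """Decode raw response bytes using the charset named in Content-Type."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns None when no charset parameter is present
    charset = msg.get_content_charset() or default
    return raw.decode(charset)


text = decode_body(b"caf\xc3\xa9", "text/html; charset=UTF-8")
plain = decode_body(b"hello", "text/html")
print(text, plain)
```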

37
Response Headers
• HTTP headers are retrieved using .info()
>>> u = urllib.urlopen("http://www.python.org")
>>> headers = u.info()
>>> headers
<httplib.HTTPMessage instance at 0x1118828>
>>> headers.keys()
['content-length', 'accept-ranges', 'server',
'last-modified', 'connection', 'etag', 'date',
'content-type']
>>> headers['content-length']
'13597'
>>> headers['content-type']
'text/html'
>>>

• A dictionary-like object

Response Status
• urlopen() ignores HTTP status codes (i.e.,
errors are silently ignored)
• Can manually check the response code
u = urllib.urlopen("http://www.python.org/java")
if u.code == 200:
    # success
    ...
elif u.code == 404:
    # Not found!
    ...
elif u.code == 403:
    # Forbidden
    ...

• Unfortunately a little clumsy (fixed shortly)


Exercise 2.1

Time : 15 Minutes


urllib Limitations

• urllib only works with simple cases


• Does not support cookies
• Does not support authentication
• Does not report HTTP errors gracefully
• Only supports GET/POST requests

urllib2 Module

• urllib2 - The sequel to urllib


• Builds upon and expands urllib
• Can interact with servers that require
cookies, passwords, and other details
• Better error handling (uses exceptions)
• Is the preferred library for modern code

urllib2 Example
• urllib2 provides urlopen() as before
>>> import urllib2
>>> u = urllib2.urlopen("http://www.python.org/index.html")
>>> data = u.read()
>>>

• However, the module expands functionality
in two primary areas
• Requests
• Openers

urllib2 Requests
• Requests are now objects
>>> r = urllib2.Request("http://www.python.org")
>>> u = urllib2.urlopen(r)
>>> data = u.read()

• Requests can have additional attributes added


• User data (for POST requests)
• Customized HTTP headers


Requests with Data


• Create a POST request with user data
data = {
    'name' : 'dave',
    'email' : 'dave@dabeaz.com'
}

r = urllib2.Request("http://somedomain.com/subscribe",
                    urllib.urlencode(data))
u = urllib2.urlopen(r)
response = u.read()

• Note: You still use urllib.urlencode() from the
older urllib library

Request Headers
• Adding/Modifying client HTTP headers
headers = {
    'User-Agent' : 'Mozilla/4.0 (compatible; MSIE 7.0; '
                   'Windows NT 5.1; .NET CLR 2.0.50727)'
}

r = urllib2.Request("http://somedomain.com/",
                    headers=headers)
u = urllib2.urlopen(r)
response = u.read()

• This can be used if you need to emulate a
specific client (e.g., Internet Explorer, etc.)
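
A small sketch of how custom headers end up on a request object, assuming Python 3 (urllib.request.Request). The client name ExampleClient/1.0 is made up for illustration:

```python
from urllib.request import Request

headers = {'User-Agent': 'ExampleClient/1.0'}   # hypothetical client name
r = Request("http://somedomain.com/", headers=headers)

# Request normalizes header names with str.capitalize(),
# so 'User-Agent' is stored as 'User-agent'
print(r.header_items())
```

Inspecting header_items() before opening the request is a cheap way to confirm what will be sent.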


urllib2 Error Handling


• HTTP Errors are reported as exceptions
>>> u = urllib2.urlopen("http://www.python.org/perl")
Traceback...
urllib2.HTTPError: HTTP Error 404: Not Found
>>>

• Catching an error
try:
    u = urllib2.urlopen(url)
except urllib2.HTTPError, e:
    code = e.code # HTTP error code

• Note: urllib2 automatically tries to handle
redirection and certain HTTP responses
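
The same pattern, sketched for Python 3 where the exception is urllib.error.HTTPError. The helper status_of() and the stand-in fake_open() are hypothetical names used only to show the control flow without a network connection:

```python
import io
from email.message import Message
from urllib.error import HTTPError

def status_of(open_func, url):
    """Return the HTTP status for url, treating HTTPError as a result."""
    try:
        u = open_func(url)
        return u.status
    except HTTPError as e:
        return e.code            # HTTP error code

def fake_open(url):
    # Hypothetical stand-in for urlopen() that always reports 404
    raise HTTPError(url, 404, "Not Found", Message(), io.BytesIO(b""))

print(status_of(fake_open, "http://www.python.org/perl"))
```

In real use, urllib.request.urlopen would be passed in place of fake_open.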

urllib2 Openers
• The function urlopen() is an "opener"
• It knows how to open a connection, interact
with the server, and return a response.
• It only has a few basic features---it does not
know how to deal with cookies and passwords
• However, you can make your own opener
objects with these features enabled


urllib2 build_opener()
• build_opener() makes a custom opener
# Make a URL opener with cookie support
opener = urllib2.build_opener(
    urllib2.HTTPCookieProcessor()
)
u = opener.open("http://www.python.org/index.html")

• Can add a set of new features from this list


CacheFTPHandler
HTTPBasicAuthHandler
HTTPCookieProcessor
HTTPDigestAuthHandler
ProxyHandler
ProxyBasicAuthHandler
ProxyDigestAuthHandler
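
A sketch of building an opener in Python 3, where build_opener() and the handler classes live in urllib.request. Passing an explicit CookieJar is optional; it is done here only so the stored cookies could be inspected later:

```python
import http.cookiejar
import urllib.request

# Keep a reference to the jar so cookies can be examined afterwards
jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(
    urllib.request.HTTPCookieProcessor(jar)
)

# The opener carries the default handlers plus the cookie processor
names = [type(h).__name__ for h in opener.handlers]
print(names)
```

No request is made here; opener.open(url) would use all of the listed handlers.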

Example : Login Cookies
fields = {
    'txtUsername' : 'dave',
    'txtPassword' : '12345',
    'submit_login' : 'Log In'
}
opener = urllib2.build_opener(
    urllib2.HTTPCookieProcessor()
)
request = urllib2.Request(
    "http://somedomain.com/login.asp",
    urllib.urlencode(fields))

# Login
u = opener.open(request)
resp = u.read()

# Get a page, but use cookies returned by initial login
u = opener.open("http://somedomain.com/private.asp")
resp = u.read()


Discussion

• urllib2 module has a huge number of options


• Different configurations
• File formats, policies, authentication, etc.
• Will have to consult reference for everything

Exercise 2.2

Time : 15 Minutes

Password: guido456


Limitations
• urllib and urllib2 are useful for fetching files
• However, neither module provides support for
more advanced operations
• Examples:
• Uploading to an FTP server
• File-upload via HTTP Post
• Other HTTP methods (e.g., HEAD, PUT)

ftplib
• A module for interacting with FTP servers
• Example : Capture a directory listing
>>> import ftplib
>>> f = ftplib.FTP("ftp.gnu.org","anonymous",
... "[email protected]")
>>> files = []
>>> f.retrlines("LIST",files.append)
'226 Directory send OK.'
>>> len(files)
15
>>> files[0]
'-rw-r--r-- 1 0 0 1765 Feb 20 16:47 README'
>>>


Upload to a FTP Server


host = "ftp.foo.com"
username = "dave"
password = "1235"
filename = "somefile.dat"

import ftplib
ftp_serv = ftplib.FTP(host,username,password)

# Open the file you want to send
f = open(filename,"rb")

# Send it to the FTP server
resp = ftp_serv.storbinary("STOR "+filename, f)

# Close the connection
ftp_serv.close()

httplib
• A module for implementing the client side of an
HTTP connection
import httplib
c = httplib.HTTPConnection("www.python.org",80)
c.putrequest("HEAD","/tut/tut.html")
c.putheader("Someheader","Somevalue")
c.endheaders()

r = c.getresponse()
data = r.read()
c.close()

• Low-level control over HTTP headers, methods,
data transmission, etc.


smtplib
• A module for sending email messages
import smtplib
serv = smtplib.SMTP()
serv.connect()

msg = """\
From: [email protected]
To: [email protected]
Subject: Get off my lawn!

Blah blah blah"""

serv.sendmail("[email protected]",['[email protected]'],msg)

• Useful if you want to have a program send you a
notification, send email to customers, etc.

Exercise 2.3

Time : 15 Minutes

Section 3

Internet Data Handling

Overview

• If you write network clients, you will have to
worry about a variety of common file formats
• CSV, HTML, XML, JSON, etc.
• In this section, we briefly look at library
support for working with such data

CSV Files
• Comma Separated Values
Elwood,Blues,"1060 W Addison,Chicago 60637",110
McGurn,Jack,"4902 N Broadway,Chicago 60640",200

• Parsing with the CSV module
import csv
f = open("schmods.csv","r")
for row in csv.reader(f):
    # Do something with items in row
    ...

• Understands quoting, various subtle details
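
To see the quoting behavior, here is a sketch (written for Python 3) that feeds the two sample rows above to csv.reader through io.StringIO instead of a file:

```python
import csv
import io

text = ('Elwood,Blues,"1060 W Addison,Chicago 60637",110\n'
        'McGurn,Jack,"4902 N Broadway,Chicago 60640",200\n')

# io.StringIO stands in for an open file
rows = list(csv.reader(io.StringIO(text)))
for row in rows:
    print(row)
```

The comma inside the quoted address does not split the field, so each row has four items. All values come back as strings.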


Parsing HTML

• Suppose you want to parse HTML (maybe
obtained via urlopen)
• Use the HTMLParser module
• A library that processes HTML using an
"event-driven" programming style

Parsing HTML
• Define a class that inherits from HTMLParser
and define a set of methods that respond to
different document features
from HTMLParser import HTMLParser
class MyParser(HTMLParser):
    def handle_starttag(self,tag,attrs):
        ...
    def handle_data(self,data):
        ...
    def handle_endtag(self,tag):
        ...

starttag data endtag

<tag attr="value" attr="value">data</tag>


Running a Parser
• To run the parser, you create a parser object
and feed it some data
# Fetch a web page
import urllib
u = urllib.urlopen("http://www.example.com")
data = u.read()

# Run it through the parser
p = MyParser()
p.feed(data)

• The parser will scan through the data and
trigger the various handler methods

HTML Example
• An example: Gather all links
from HTMLParser import HTMLParser
class GatherLinks(HTMLParser):
    def __init__(self):
        HTMLParser.__init__(self)
        self.links = []
    def handle_starttag(self,tag,attrs):
        if tag == 'a':
            for name,value in attrs:
                if name == 'href':
                    self.links.append(value)


HTML Example
• Running the parser
>>> parser = GatherLinks()
>>> import urllib
>>> data = urllib.urlopen("http://www.python.org").read()
>>> parser.feed(data)
>>> for x in parser.links:
...     print x
/search/
/about
/news/
/doc/
/download/
...
>>>
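
A self-contained variant of the link gatherer, sketched for Python 3 where the class lives in html.parser. The page string is made-up markup standing in for a downloaded document:

```python
from html.parser import HTMLParser

class GatherLinks(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == 'a':
            for name, value in attrs:
                if name == 'href':
                    self.links.append(value)

# Made-up markup; tag and attribute names are lowercased by the parser
page = ('<p><a href="/about">About</a> <a name="x">no link</a> '
        '<A HREF="/doc/">Docs</A></p>')
parser = GatherLinks()
parser.feed(page)
print(parser.links)
```

Anchors without an href attribute are skipped, and upper-case markup is handled because the parser normalizes names before calling the handler.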

XML Parsing with SAX

• The event-driven style used by HTMLParser is
sometimes used to parse XML
• Basis of the SAX parsing interface
• An approach sometimes seen when dealing
with large XML documents since it allows for
incremental processing


Brief XML Refresher


• XML documents use structured markup
<contact>
<name>Elwood Blues</name>
<address>1060 W Addison</address>
<city>Chicago</city>
<zip>60616</zip>
</contact>

• Documents made up of elements
<name>Elwood Blues</name>

• Elements have starting/ending tags
• May contain text and other elements

SAX Parsing
• Define a special handler class
import xml.sax

class MyHandler(xml.sax.ContentHandler):
    def startDocument(self):
        print "Document start"
    def startElement(self,name,attrs):
        print "Start:", name
    def characters(self,text):
        print "Characters:", text
    def endElement(self,name):
        print "End:", name

• In the class, you define methods that capture
elements and other parts of the document


SAX Parsing
• To parse a document, you create an instance
of the handler and give it to the parser
# Create the handler object
hand = MyHandler()

# Parse a document using the handler
xml.sax.parse("data.xml",hand)

• This reads the file and calls handler methods
as different document elements are
encountered (start tags, text, end tags, etc.)
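
A sketch of a handler that collects data instead of printing it, written for Python 3. The class name NameCollector and the inline document are illustrative only; xml.sax.parseString() is used so that no file is needed:

```python
import xml.sax

class NameCollector(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.names = []
        self._buf = None
    def startElement(self, name, attrs):
        if name == 'name':
            self._buf = []
    def characters(self, text):
        # Text may arrive in several chunks, so accumulate it
        if self._buf is not None:
            self._buf.append(text)
    def endElement(self, name):
        if name == 'name':
            self.names.append(''.join(self._buf))
            self._buf = None

doc = b"<contact><name>Elwood Blues</name><city>Chicago</city></contact>"
hand = NameCollector()
xml.sax.parseString(doc, hand)
print(hand.names)
```

Because the handler only keeps what it needs, memory use stays small even for large documents.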

Exercise 3.1

Time : 15 Minutes


XML and ElementTree

• xml.etree.ElementTree module is one of
the easiest ways to parse XML
• Let's look at the highlights

etree Parsing Basics
• Parsing a document
from xml.etree.ElementTree import parse
doc = parse("recipe.xml")

• This builds a complete parse tree of the
entire document
• To extract data, you will perform various
kinds of queries on the document object


etree Parsing Basics


• A mini-reference for extracting data
• Finding one or more elements
elem = doc.find("title")
for elem in doc.findall("ingredients/item"):
    statements

• Element attributes and properties
elem.tag # Element name
elem.text # Element text
elem.get(aname [,default]) # Element attributes
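
The mini-reference can be tried on a small inline document. This sketch (Python 3) uses fromstring() on a shortened recipe rather than parse() on a file; the "each" default for a missing units attribute is an assumption made for the example:

```python
from xml.etree.ElementTree import fromstring

recipe = fromstring("""<recipe>
  <title>Famous Guacamole</title>
  <ingredients>
    <item num="2">Large avocados, chopped</item>
    <item num="1/2" units="C">White onion, chopped</item>
  </ingredients>
</recipe>""")

# find() returns the first match, findall() returns all matches
title = recipe.find("title").text
items = [(item.get("num"), item.get("units", "each"), item.text)
         for item in recipe.findall("ingredients/item")]
print(title)
print(items)
```

Paths are relative to the element being searched, so "ingredients/item" is resolved starting at the recipe element.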

Obtaining Elements
doc = parse("recipe.xml")
desc_elem = doc.find("description")
desc_text = desc_elem.text

or

doc = parse("recipe.xml")
desc_text = doc.findtext("description")

<?xml version="1.0" encoding="iso-8859-1"?>
<recipe>
<title>Famous Guacamole</title>
<description>
A southwest favorite!
</description>
<ingredients>
<item num="2">Large avocados, chopped</item>
<item num="1">Tomato, chopped</item>
<item num="1/2" units="C">White onion, chopped</item>
<item num="1" units="tbl">Fresh squeezed lemon juice</item>
<item num="1">Jalapeno pepper, diced</item>
<item num="1" units="tbl">Fresh cilantro, minced</item>
<item num="3" units="tsp">Sea Salt</item>
<item num="6" units="bottles">Ice-cold beer</item>
</ingredients>
<directions>
Combine all ingredients and hand whisk to desired consistency.
Serve and enjoy with ice-cold beers.
</directions>
</recipe>


Iterating over Elements
doc = parse("recipe.xml")
for item in doc.findall("ingredients/item"):
    statements

<?xml version="1.0" encoding="iso-8859-1"?>
<recipe>
<title>Famous Guacamole</title>
<description>
A southwest favorite!
</description>
<ingredients>
<item num="2">Large avocados, chopped</item>
<item num="1">Tomato, chopped</item>
<item num="1/2" units="C">White onion, chopped</item>
<item num="1" units="tbl">Fresh squeezed lemon juice</item>
<item num="1">Jalapeno pepper, diced</item>
<item num="1" units="tbl">Fresh cilantro, minced</item>
<item num="3" units="tsp">Sea Salt</item>
<item num="6" units="bottles">Ice-cold beer</item>
</ingredients>
<directions>
Combine all ingredients and hand whisk to desired consistency.
Serve and enjoy with ice-cold beers.
</directions>
</recipe>

Element Attributes
for item in doc.findall("ingredients/item"):
    num = item.get("num")
    units = item.get("units")

<?xml version="1.0" encoding="iso-8859-1"?>
<recipe>
<title>Famous Guacamole</title>
<description>
A southwest favorite!
</description>
<ingredients>
<item num="2">Large avocados, chopped</item>
<item num="1">Tomato, chopped</item>
<item num="1/2" units="C">White onion, chopped</item>
<item num="1" units="tbl">Fresh squeezed lemon juice</item>
<item num="1">Jalapeno pepper, diced</item>
<item num="1" units="tbl">Fresh cilantro, minced</item>
<item num="3" units="tsp">Sea Salt</item>
<item num="6" units="bottles">Ice-cold beer</item>
</ingredients>
<directions>
Combine all ingredients and hand whisk to desired consistency.
Serve and enjoy with ice-cold beers.
</directions>
</recipe>


Search Wildcards
• Specifying a wildcard for an element name
items = doc.findall("*/item")
items = doc.findall("ingredients/*")

• The * wildcard only matches a single element
• Use multiple wildcards for nesting
<?xml version="1.0"?>
<top>
<a>
<b>
<c>text</c>
</b>
</a>
</top>

c = doc.findall("*/*/c")
c = doc.findall("a/*/c")
c = doc.findall("*/b/c")

Search Wildcards
• Wildcard for multiple nesting levels (//)
items = doc.findall("//item")

• More examples
<?xml version="1.0"?>
<top>
<a>
<b>
<c>text</c>
</b>
</a>
</top>

c = doc.findall("//c")
c = doc.findall("a//c")
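
The wildcard rules can be checked on the small document above. This sketch is for Python 3 and searches from the root element; it uses the ".//c" spelling, which is the form current Python 3 releases expect when searching from an element:

```python
from xml.etree.ElementTree import fromstring

top = fromstring("<top><a><b><c>text</c></b></a></top>")

# One * per nesting level
single = top.findall("*/*/c")
# './/' searches all levels below the starting element
deep = top.findall(".//c")
# A single * is not enough to reach <c>, which sits two levels below <a>
miss = top.findall("*/c")
print(len(single), len(deep), len(miss))
```

The last search comes back empty, which illustrates that * stands for exactly one level.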


cElementTree
• There is a C implementation of the library
that is significantly faster
import xml.etree.cElementTree
doc = xml.etree.cElementTree.parse("data.xml")

• For all practical purposes, you should use
this version of the library given a choice
• Note : The C version lacks a few advanced
customization features, but you probably
won't need them

Tree Modification
• ElementTree allows modifications to be
made to the document structure
• To add a new child to a parent node
node.append(child)

• To insert a new child at a selected position
node.insert(index,child)

• To remove a child from a parent node
node.remove(child)


Tree Output
• If you modify a document, it can be rewritten
• There is a method to write XML
doc = xml.etree.ElementTree.parse("input.xml")
# Make modifications to doc
...
# Write modified document back to a file
f = open("output.xml","w")
doc.write(f)

• Individual elements can be turned into strings
s = xml.etree.ElementTree.tostring(node)
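
A sketch combining modification and output, assuming Python 3. The tiny list document is invented for the example; encoding="unicode" asks tostring() for a str instead of bytes:

```python
import xml.etree.ElementTree as ET

node = ET.fromstring("<list><item>a</item><item>c</item></list>")

child = ET.Element("item")
child.text = "b"
node.insert(1, child)      # place between the two existing items
node.remove(node[0])       # drop the first item

# encoding="unicode" returns a str rather than bytes
s = ET.tostring(node, encoding="unicode")
print(s)
```

The same calls work on elements obtained from a parsed file.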

Iterative Parsing
• An alternative parsing interface
from xml.etree.ElementTree import iterparse
parser = iterparse("file.xml", ('start','end'))

for event, elem in parser:
    if event == 'start':
        # Encountered a start <tag ...>
        ...
    elif event == 'end':
        # Encountered an end </tag>
        ...

• This sweeps over an entire XML document
• Result is a sequence of start/end events and
element objects being processed


Iterative Parsing
• If you combine iterative parsing and tree
modification together, you can process
large XML documents with almost no
memory overhead
• Programming interface is significantly easier
to use than a similar approach using SAX
• General idea : Simply throw away the
elements no longer needed during parsing

Iterative Parsing
• Programming pattern
from xml.etree.ElementTree import iterparse
parser = iterparse("file.xml",('start','end'))

for event,elem in parser:
    if event == 'start':
        if elem.tag == 'parenttag':
            parent = elem
    if event == 'end':
        if elem.tag == 'tagname':
            # process element with tag 'tagname'
            ...
            # Discard the element when done
            parent.remove(elem)

• The last step is the critical part
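
The pattern can be exercised on an in-memory document. In this Python 3 sketch, the generator name item_texts() and the sample XML are made up; io.BytesIO stands in for a large file:

```python
import io
from xml.etree.ElementTree import iterparse

XML = b"<ingredients><item>a</item><item>b</item><item>c</item></ingredients>"

def item_texts(source):
    parent = None
    for event, elem in iterparse(source, events=('start', 'end')):
        if event == 'start' and elem.tag == 'ingredients':
            parent = elem
        elif event == 'end' and elem.tag == 'item':
            yield elem.text
            # Discard the finished element so the tree does not grow
            parent.remove(elem)

texts = list(item_texts(io.BytesIO(XML)))
print(texts)
```

Element text is only reliable at the 'end' event, which is why processing and removal both happen there.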



Exercise 3.2

Time : 15 Minutes

JSON
• JavaScript Object Notation
• A data encoding commonly used on the
web when interacting with JavaScript
• Sometimes preferred over XML because it's
less verbose and faster to parse
• Syntax is almost identical to a Python dict


Sample JSON File


{
  "recipe" : {
    "title" : "Famous Guacamole",
    "description" : "A southwest favorite!",
    "ingredients" : [
      {"num": "2", "item":"Large avocados, chopped"},
      {"num": "1/2", "units":"C", "item":"White onion, chopped"},
      {"num": "1", "units":"tbl", "item":"Fresh squeezed lemon juice"},
      {"num": "1", "item":"Jalapeno pepper, diced"},
      {"num": "1", "units":"tbl", "item":"Fresh cilantro, minced"},
      {"num": "3", "units":"tsp", "item":"Sea Salt"},
      {"num": "6", "units":"bottles","item":"Ice-cold beer"}
    ],
    "directions" : "Combine all ingredients and hand whisk to desired consistency. Serve and enjoy with ice-cold beers."
  }
}

Processing JSON Data
• Parsing a JSON document
import json
doc = json.load(open("recipe.json"))

• Result is a collection of nested dict/lists
ingredients = doc['recipe']['ingredients']
for item in ingredients:
    # Process item
    ...

• Dumping a dictionary as JSON
f = open("file.json","w")
json.dump(doc,f)
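
The string-based counterparts json.loads() and json.dumps() make it easy to try this without files. A sketch using a shortened version of the recipe document:

```python
import json

text = ('{"recipe": {"title": "Famous Guacamole", "ingredients": ['
        '{"num": "2", "item": "Large avocados, chopped"}, '
        '{"num": "3", "units": "tsp", "item": "Sea Salt"}]}}')

doc = json.loads(text)

# Optional keys are handled with dict.get(), which returns None if absent
units = [item.get('units') for item in doc['recipe']['ingredients']]

# Encoding and decoding again gives back an equal structure
round_trip = json.loads(json.dumps(doc))
print(units)
```

JSON objects become dicts and JSON arrays become lists, so ordinary Python indexing applies.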


Exercise 3.3

Time : 15 Minutes

Section 4

Web Programming Basics

Introduction

• The web is so pervasive that knowing how to
write simple web-based applications is basic
knowledge for all programmers
• In this section, we cover the absolute
basics of how to make a Python program
accessible through the web

Overview

• Some basics of Python web programming


• HTTP Protocol
• CGI scripting
• WSGI (Web Server Gateway Interface)
• Custom HTTP servers


Disclaimer
• Web programming is a huge topic that
could span an entire multi-day class
• It might mean different things
• Building an entire website
• Implementing a web service
• Our focus is on some basic mechanisms
found in the Python standard library that all
Python programmers should know about

HTTP Explained
• HTTP is the underlying protocol of the web
• Consists of requests and responses
Browser  ---- GET /index.html ---->  Web Server
Browser  <--- 200 OK ... <content> ---  Web Server


HTTP Client Requests


• Client (Browser) sends a request
GET /index.html HTTP/1.1
Host: www.python.org
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.3) Gec
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/p
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
<blank line>

• Request line followed by headers that provide
additional information about the client

HTTP Responses
• Server sends back a response
HTTP/1.1 200 OK
Date: Thu, 26 Apr 2007 19:54:01 GMT
Server: Apache/2.0.54 (Debian GNU/Linux) DAV/2 SVN/1.1.4 mod_python/3.1.3 Pyt
Last-Modified: Thu, 26 Apr 2007 18:40:24 GMT
Accept-Ranges: bytes
Content-Length: 14315
Connection: close
Content-Type: text/html

<HTML>
...

• Response line followed by headers that
further describe the response contents


HTTP Protocol
• There are a small number of request types
GET
POST
HEAD
PUT

• There are standardized response codes
200 OK
403 Forbidden
404 Not Found
501 Not Implemented
...

• But, this isn't an exhaustive tutorial


Content Encoding
• Content is described by these header fields:
Content-type:
Content-length:

• Example:
Content-type: image/jpeg
Content-length: 12422

• Of these, Content-type is the most critical


• Length is optional, but it's polite to include it if
it can be determined in advance


Payload Packaging
• Responses must follow this formatting
Headers
...
Content-type: image/jpeg
Content-length: 12422
...

\r\n (Blank Line)

Content
(12422 bytes)
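
The layout above can be sketched as a small function. The name package_response() and the HTTP/1.0 status line are choices made for this illustration, not part of the slides:

```python
def package_response(content, content_type):
    """Build raw response bytes: status line, headers, blank line, content."""
    head = [
        "HTTP/1.0 200 OK",
        "Content-type: %s" % content_type,
        "Content-length: %d" % len(content),
    ]
    # Each line ends with \r\n; an empty line separates headers from content
    return ("\r\n".join(head) + "\r\n\r\n").encode("ascii") + content

raw = package_response(b"<html>hi</html>", "text/html")

# A client splits at the first blank line to recover the two parts
head, _, body = raw.partition(b"\r\n\r\n")
print(head.decode("ascii"))
print(body)
```

The Content-length value is the size of the content in bytes, which is why it is computed from the encoded payload.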

Exercise 4.1

Time : 10 Minutes


Role of Python
• Most web-related Python programming
pertains to the operation of the server
Firefox, Safari,            GET /index.html      Web Server:
Internet Explorer, etc.   ------------------->   Apache, Python, MySQL, etc.

• Python scripts used on the server to create,
manage, or deliver content back to clients

Typical Python Tasks

• Static content generation. One-time
generation of static web pages to be served
by a standard web server such as Apache.
• Dynamic content generation. Python scripts
that produce output in response to requests
(e.g., form processing, CGI scripting).


Content Generation
• It is often overlooked, but Python is a useful
tool for simply creating static web pages
• Example : Taking various pages of content,
adding elements, and applying a common
format across all of them.
• Web server simply delivers all of the generated
content as normal files

Example : Page Templates
• Create a page "template" file
<html>
<body>
<table width=700>
<tr><td>
Your Logo : Navigation Links
<hr>
</td></tr>
<tr><td>
$content
<hr>
<em>Copyright (C) 2008</em>
</td></tr>
</table>
</body>
</html>

Note the special $variable ($content)


Example : Page Templates


• Use template strings to render pages
from string import Template

# Read the template string
pagetemplate = Template(open("template.html").read())

# Go make content
page = make_content()

# Render the template to a file
f = open(outfile,"w")
f.write(pagetemplate.substitute(content=page))

• Key idea : If you want to change the
appearance, you just change the template
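
A self-contained sketch of the substitution step. The inline template text is a cut-down stand-in for template.html:

```python
from string import Template

pagetemplate = Template("<html><body>\n$content\n<hr>\n</body></html>")

# substitute() raises KeyError if a $variable is left unfilled;
# safe_substitute() would leave the placeholder in place instead
page = pagetemplate.substitute(content="<h1>Hello</h1>")
print(page)
```

A literal dollar sign in a template is written as $$.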

Commentary

• Using page templates to generate static
content is extremely common
• For simple things, just use the standard library
modules (e.g., string.Template)
• For more advanced applications, there are
numerous third-party template packages


Exercise 4.2

Time : 10 Minutes

HTTP Servers
• Python comes with libraries that implement
simple self-contained web servers
• Very useful for testing or special situations
where you want web service, but don't want
to install something larger (e.g., Apache)
• Not high performance, but sometimes "good
enough" is just that


A Simple Web Server


• Serve files from a directory
from BaseHTTPServer import HTTPServer
from SimpleHTTPServer import SimpleHTTPRequestHandler
import os
os.chdir("/home/docs/html")
serv = HTTPServer(("",8080),SimpleHTTPRequestHandler)
serv.serve_forever()

• This creates a minimal web server


• Connect with a browser and try it out
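
Instead of a browser, the server can be tried from the same script. This is a sketch for Python 3.7+, where the classes live in http.server and the HTTP client in http.client. The temporary directory, the QuietHandler subclass, and the use of port 0 (any free port) are choices made for the example, and it assumes loopback connections are allowed:

```python
import functools
import http.client
import os
import tempfile
import threading
from http.server import HTTPServer, SimpleHTTPRequestHandler

class QuietHandler(SimpleHTTPRequestHandler):
    def log_message(self, format, *args):
        pass                      # suppress per-request logging

# Create a directory with one file to serve
docroot = tempfile.mkdtemp()
with open(os.path.join(docroot, "index.html"), "w") as f:
    f.write("<html>hello</html>")

# Port 0 asks the OS for any free port
handler = functools.partial(QuietHandler, directory=docroot)
serv = HTTPServer(("127.0.0.1", 0), handler)
threading.Thread(target=serv.serve_forever, daemon=True).start()

# Fetch the file with a low-level HTTP client
c = http.client.HTTPConnection("127.0.0.1", serv.server_port)
c.request("GET", "/index.html")
r = c.getresponse()
status, body = r.status, r.read()
c.close()

serv.shutdown()
serv.server_close()
print(status, body)
```

Running the server in a background thread keeps the example in one process; a standalone server would simply call serve_forever() in the main thread.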

Exercise 4.3

Time : 10 Minutes


A Web Server with CGI


• Serve files and allow CGI scripts
from BaseHTTPServer import HTTPServer
from CGIHTTPServer import CGIHTTPRequestHandler
import os
os.chdir("/home/docs/html")
serv = HTTPServer(("",8080),CGIHTTPRequestHandler)
serv.serve_forever()

• Executes scripts in "/cgi-bin" and "/htbin"
directories in order to create dynamic content

CGI Scripting
• Common Gateway Interface
• A common protocol used by existing web
servers to run server-side scripts, plugins
• Example: Running Python, Perl, Ruby scripts
under Apache, etc.
• Classically associated with form processing,
but that's far from the only application


CGI Example
• A web-page might have a form on it

• Here is the underlying HTML code


<FORM ACTION="/cgi-bin/subscribe.py" METHOD="POST">
Your name: <INPUT type="text" name="name" size="30"><br>
Your email: <INPUT type="text" name="email" size="30"><br>
<INPUT type="submit" name="submit-button" value="Subscribe">

Specifies a CGI program on the server

CGI Example
• Forms have submitted fields or parameters
<FORM ACTION="/cgi-bin/subscribe.py" METHOD="POST">
Your name: <INPUT type="text" name="name" size="30"><br>
Your email: <INPUT type="text" name="email" size="30"><br>
<INPUT type="submit" name="submit-button" value="Subscribe">

• A request will include both the URL
(/cgi-bin/subscribe.py) along with the field values


CGI Example
• Request encoding looks like this:
Request:
POST /cgi-bin/subscribe.py HTTP/1.1
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS
Accept: text/xml,application/xml,application/xhtml
Accept-Language: en-us,en;q=0.5
...

Query String:
name=David+Beazley&email=dave%40dabeaz.com&submit-button=Subscribe

• Request tells the server what to run


• Query string contains encoded form fields

CGI Mechanics
• CGI was originally implemented as a scheme
for launching processing scripts as a subprocess
to a web server
/cgi-bin/subscribe.py  -->  HTTP Server
                                |
                          stdin | stdout
                                |
                        Python (subscribe.py)

• Script will decode the request and carry out
some kind of action


Classic CGI Interface


• Server populates environment variables with
information about the request
import os
os.environ['SCRIPT_NAME']
os.environ['REMOTE_ADDR']
os.environ['QUERY_STRING']
os.environ['REQUEST_METHOD']
os.environ['CONTENT_TYPE']
os.environ['CONTENT_LENGTH']
os.environ['HTTP_COOKIE']
...

• stdin/stdout provide I/O link to server
sys.stdin # Read to get data sent by client
sys.stdout # Write to create the response

CGI Query Variables
• For GET requests, an env. variable is used
query = os.environ['QUERY_STRING']

• For POST requests, you read from stdin
if os.environ['REQUEST_METHOD'] == 'POST':
    size = int(os.environ['CONTENT_LENGTH'])
    query = sys.stdin.read(size)

• This yields the raw query string
name=David+Beazley&email=dave%40dabeaz.com&submit-button=Subscribe
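
Decoding the raw query string by hand is one line with the standard library. This sketch assumes Python 3, where parse_qs() lives in urllib.parse:

```python
from urllib.parse import parse_qs

query = "name=David+Beazley&email=dave%40dabeaz.com&submit-button=Subscribe"

# Each field maps to a list because a name may appear more than once
fields = parse_qs(query)
name = fields['name'][0]
email = fields['email'][0]
print(name, email)
```

The '+' and '%40' escapes are undone automatically, giving back the values the user typed into the form.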


cgi Module
• A utility library for decoding requests
• Major feature: Getting the passed parameters
#!/usr/bin/env python
# subscribe.py
import cgi
form = cgi.FieldStorage() # Parse parameters
# Get various field values
name = form.getvalue('name')
email = form.getvalue('email')

• All CGI scripts start like this


• FieldStorage parses the incoming request into
a dictionary-like object for extracting inputs

CGI Responses
• CGI scripts respond by simply printing
response headers and the raw content
name = form.getvalue('name')
email = form.getvalue('email')
... do some kind of processing ...

# Output a response
print "Status: 200 OK"
print "Content-type: text/html"
print
print "<html><head><title>Success!</title></head><body>"
print "Hello %s, your email is %s" % (name,email)
print "</body></html>"

• Normally you print HTML, but any kind of
data can be returned (for web services, you
might return XML, JSON, etc.)

Note on Status Codes


• In CGI, the server status code is set by
including a special "Status:" header field
import cgi
form = cgi.FieldStorage()
name = form.getvalue('name')
email = form.getvalue('email')
...
print "Status: 200 OK"
print "Content-type: text/html"
print
print "<html><head><title>Success!</title></head><body>"
print "Hello %s, your email is %s" % (name,email)
print "</body></html>"

• This is a special server directive that sets the
response status

CGI Commentary
• There are many more minor details (consult
a reference on CGI programming)
• The basic idea is simple
• Server runs a script
• Script receives inputs from
environment variables and stdin
• Script produces output on stdout
• It's old-school, but sometimes it's all you get
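
The three steps above can be sketched as one function. This is only an illustration, assuming Python 3; run_cgi is a hypothetical helper (not part of any library) that takes the environment and the two streams as arguments so that the I/O model is explicit:

```python
import io
from urllib.parse import parse_qs

def run_cgi(environ, stdin, stdout):
    """Hypothetical helper: handle one CGI request using explicit streams."""
    # Inputs: GET parameters arrive in QUERY_STRING, POST bodies on stdin
    if environ.get("REQUEST_METHOD") == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        raw = environ.get("QUERY_STRING", "")
    fields = parse_qs(raw)
    name = fields.get("name", ["stranger"])[0]

    # Output: headers, a blank line, then the content
    stdout.write("Status: 200 OK\n")
    stdout.write("Content-type: text/plain\n")
    stdout.write("\n")
    stdout.write("Hello %s\n" % name)

# Simulate what a web server would do for a GET request with ?name=Dave
out = io.StringIO()
run_cgi({"REQUEST_METHOD": "GET", "QUERY_STRING": "name=Dave"},
        io.StringIO(""), out)
response = out.getvalue()
```

In a real CGI script the call would be run_cgi(os.environ, sys.stdin, sys.stdout).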
Copyright (C) 2010, http://www.dabeaz.com 4- 33

Exercise 4.4

Time : 25 Minutes

Copyright (C) 2010, http://www.dabeaz.com 4- 34

WSGI
• Web Server Gateway Interface (WSGI)
• This is a standardized interface for creating
Python web services
• Allows one to create code that can run under a
wide variety of web servers and frameworks as
long as they also support WSGI (and most do)
• So, what is WSGI?
Copyright (C) 2010, http://www.dabeaz.com 4- 35

WSGI Interface
• WSGI is an application programming interface
loosely based on CGI programming
• In CGI, there are just two basic features
    • Getting values of inputs (env variables)
    • Producing output by printing
• WSGI takes this concept and repackages it into
a more modular form

Copyright (C) 2010, http://www.dabeaz.com 4- 36

WSGI Example
• With WSGI, you write an "application"
• An application is just a function (or callable)
def hello_app(environ, start_response):
    status = "200 OK"
    response_headers = [ ('Content-type','text/plain')]
    response = []

    start_response(status,response_headers)
    response.append("Hello World\n")
    response.append("You requested :"+environ['PATH_INFO'])
    return response

• This function encapsulates the handling of some
request that will be received
Copyright (C) 2010, http://www.dabeaz.com 4- 37

WSGI Applications
• Applications always receive just two inputs
def hello_app(environ, start_response):
    status = "200 OK"
    response_headers = [ ('Content-type','text/plain')]
    response = []

    start_response(status,response_headers)
    response.append("Hello World\n")
    response.append("You requested :"+environ['PATH_INFO'])
    return response

• environ - A dictionary of input parameters


• start_response - A callable (e.g., function)
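
Because an application is just a callable taking these two inputs, it can be exercised without any web server. A minimal sketch, assuming Python 3 (where body items must be bytes rather than the Python 2 strings used on the slide); fake_start_response and captured are illustrative names, not part of WSGI:

```python
def hello_app(environ, start_response):
    status = "200 OK"
    response_headers = [("Content-type", "text/plain")]
    start_response(status, response_headers)
    # Python 3: the body must be bytes, so encode the text
    body = "Hello World\nYou requested :" + environ["PATH_INFO"]
    return [body.encode("utf-8")]

# Stand-in for the server side: record what the app passes back
captured = {}
def fake_start_response(status, headers):
    captured["status"] = status
    captured["headers"] = headers

result = hello_app({"PATH_INFO": "/index.html"}, fake_start_response)
```

Calling the function directly like this is also a convenient way to unit test an application.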
Copyright (C) 2010, http://www.dabeaz.com 4- 38

WSGI Environment
• The environment contains CGI variables
def hello_app(environ, start_response):
    status = "200 OK"
    response_headers = [ ('Content-type','text/plain')]
    response = []

    start_response(status,response_headers)
    response.append("Hello World\n")
    response.append("You requested :"+environ['PATH_INFO'])
    return response

environ['REQUEST_METHOD']
environ['SCRIPT_NAME']
environ['PATH_INFO']
environ['QUERY_STRING']
environ['CONTENT_TYPE']
environ['CONTENT_LENGTH']
environ['SERVER_NAME']
...

• The meaning and values are exactly the same as
in traditional CGI programs
Copyright (C) 2010, http://www.dabeaz.com 4- 39

WSGI Environment
• Environment also contains some WSGI variables
def hello_app(environ, start_response):
    status = "200 OK"
    response_headers = [ ('Content-type','text/plain')]
    response = []

    start_response(status,response_headers)
    response.append("Hello World\n")
    response.append("You requested :"+environ['PATH_INFO'])
    return response

environ['wsgi.input']
environ['wsgi.errors']
environ['wsgi.url_scheme']
environ['wsgi.multithread']
environ['wsgi.multiprocess']
...

• wsgi.input - A file-like object for reading data


• wsgi.errors - File-like object for error output
Copyright (C) 2010, http://www.dabeaz.com 4- 40

Processing WSGI Inputs
• Parsing of query strings is similar to CGI
import cgi
def sample_app(environ,start_response):
    fields = cgi.FieldStorage(environ['wsgi.input'],
                              environ=environ)
    # fields now has the CGI query variables
    ...

• You use FieldStorage() as before, but give it
extra parameters telling it where to get data
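
The cgi module was removed from the standard library in Python 3.13, so on a current interpreter the same inputs can be read directly. A rough sketch, assuming Python 3 and a simple URL-encoded form body (no file uploads or multipart data); get_fields is a hypothetical helper name:

```python
import io
from urllib.parse import parse_qs

def get_fields(environ):
    """Hypothetical helper: collect GET and POST form fields into one dict."""
    fields = parse_qs(environ.get("QUERY_STRING", ""))
    if environ.get("REQUEST_METHOD") == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = environ["wsgi.input"].read(length)      # bytes
        for key, values in parse_qs(body.decode("utf-8")).items():
            fields.setdefault(key, []).extend(values)
    return fields

# A hand-built environ standing in for what a server would supply
post_body = b"name=Dave&email=d%40x.y"
environ = {
    "REQUEST_METHOD": "POST",
    "QUERY_STRING": "page=2",
    "CONTENT_LENGTH": str(len(post_body)),
    "wsgi.input": io.BytesIO(post_body),
}
fields = get_fields(environ)
```

Reading exactly CONTENT_LENGTH bytes matters here: wsgi.input is a stream, and a server is not obliged to signal end of file on it.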

Copyright (C) 2010, http://www.dabeaz.com 4- 41

WSGI Responses
• The second argument is a function that is called
to initiate a response
def hello_app(environ, start_response):
    status = "200 OK"
    response_headers = [ ('Content-type','text/plain')]
    response = []

    start_response(status,response_headers)
    response.append("Hello World\n")
    response.append("You requested :"+environ['PATH_INFO'])
    return response

• You pass it two parameters
    • A status string (e.g., "200 OK")
    • A list of (header, value) HTTP header pairs
Copyright (C) 2010, http://www.dabeaz.com 4- 42

WSGI Responses

• start_response() is a hook back to the server


• Gives the server information for formulating
the response (status, headers, etc.)
• Prepares the server for receiving content data

Copyright (C) 2010, http://www.dabeaz.com 4- 43

WSGI Content
• Content is returned as a sequence of byte strings
def hello_app(environ, start_response):
    status = "200 OK"
    response_headers = [ ('Content-type','text/plain')]
    response = []

    start_response(status,response_headers)
    response.append("Hello World\n")
    response.append("You requested :"+environ['PATH_INFO'])
    return response

• Note: This differs from CGI programming
where you produce output using print.

Copyright (C) 2010, http://www.dabeaz.com 4- 44

WSGI Content Encoding
• WSGI applications must always produce bytes
• If working with Unicode, it must be encoded
def hello_app(environ, start_response):
    status = "200 OK"
    response_headers = [ ('Content-type','text/html')]

    start_response(status,response_headers)
    return [u"That's a spicy Jalape\u00f1o".encode('utf-8')]

• This is a little tricky--if you're not anticipating
Unicode, everything can break if a Unicode
string is returned (be aware that certain
modules such as database modules may do this)
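
One way to guard against stray text strings is to encode at the last moment, just before content leaves the application. A small sketch, assuming Python 3 (where text is str and encoded data is bytes); encode_body is a hypothetical helper, not part of WSGI:

```python
def encode_body(chunks, encoding="utf-8"):
    """Hypothetical helper: yield bytes for every chunk, encoding text as needed."""
    for chunk in chunks:
        if isinstance(chunk, str):
            chunk = chunk.encode(encoding)
        yield chunk

def hello_app(environ, start_response):
    start_response("200 OK", [("Content-type", "text/html; charset=utf-8")])
    parts = ["That's a spicy ", "Jalape\u00f1o", b"!"]   # mixed text and bytes
    return encode_body(parts)

body = b"".join(hello_app({}, lambda status, headers: None))
```

Declaring the charset in the Content-type header tells the client which encoding was used.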
Copyright (C) 2010, http://www.dabeaz.com 4- 45

WSGI Deployment
• The main point of WSGI is to simplify
deployment of web applications
• You will notice that the interface depends on
no third party libraries, no objects, or even any
standard library modules
• That is intentional. WSGI apps are supposed to
be small self-contained units that plug into
other environments

Copyright (C) 2010, http://www.dabeaz.com 4- 46

WSGI Deployment
• Running a simple stand-alone WSGI server
from wsgiref import simple_server
httpd = simple_server.make_server("",8080,hello_app)
httpd.serve_forever()

• This runs an HTTP server for testing


• You probably wouldn't deploy anything using
this, but if you're developing code on your own
machine, it can be useful

Copyright (C) 2010, http://www.dabeaz.com 4- 47

WSGI and CGI


• WSGI applications can run on top of standard
CGI scripting (which is useful if you're
interfacing with traditional web servers).
#!/usr/bin/env python
# hello.py

def hello_app(environ,start_response):
    ...

import wsgiref.handlers
wsgiref.handlers.CGIHandler().run(hello_app)

Copyright (C) 2010, http://www.dabeaz.com 4- 48

Exercise 4.5

Time : 20 Minutes

Copyright (C) 2010, http://www.dabeaz.com 4- 49

Customized HTTP

• Can implement customized HTTP servers


• Use BaseHTTPServer module
• Define a customized HTTP handler object
• Requires some knowledge of the underlying
HTTP protocol

Copyright (C) 2010, http://www.dabeaz.com 4- 50

Customized HTTP
• Example: A Hello World Server
from BaseHTTPServer import BaseHTTPRequestHandler,HTTPServer

class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == '/hello':
            self.send_response(200,"OK")
            self.send_header('Content-type','text/html')
            self.end_headers()
            self.wfile.write("""<HTML>
<HEAD><TITLE>Hello</TITLE></HEAD>
<BODY>Hello World!</BODY></HTML>""")
        else:
            # Any other path gets an error response
            self.send_error(404,"Not Found")

serv = HTTPServer(("",8080),HelloHandler)
serv.serve_forever()

• Defined a method for "GET" requests
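
For readers on Python 3, the same idea can be tried end to end on the loopback interface. This is a sketch under stated assumptions: the module is named http.server in Python 3, port 0 asks the operating system for any free port, and the background thread exists only so that one script can act as both server and client:

```python
import threading
import http.client
from http.server import BaseHTTPRequestHandler, HTTPServer

class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/hello":
            body = b"<html><body>Hello World!</body></html>"
            self.send_response(200, "OK")
            self.send_header("Content-type", "text/html")
            self.send_header("Content-length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)        # wfile expects bytes in Python 3
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass                              # keep the demo quiet

# Start the server on a free loopback port in a background thread
serv = HTTPServer(("127.0.0.1", 0), HelloHandler)
port = serv.server_address[1]
t = threading.Thread(target=serv.serve_forever, daemon=True)
t.start()

# Act as the client and fetch /hello
conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
conn.request("GET", "/hello")
resp = conn.getresponse()
status, content = resp.status, resp.read()
conn.close()

serv.shutdown()
serv.server_close()
```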


Copyright (C) 2010, http://www.dabeaz.com 4- 51

Customized HTTP
• A more complex server
from BaseHTTPServer import BaseHTTPRequestHandler,HTTPServer

# Redefine the behavior of the server by defining code for
# all of the standard HTTP request types
class MyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        ...
    def do_POST(self):
        ...
    def do_HEAD(self):
        ...
    def do_PUT(self):
        ...

serv = HTTPServer(("",8080),MyHandler)
serv.serve_forever()

• Can customize everything (requires work)


Copyright (C) 2010, http://www.dabeaz.com 4- 52

Exercise 4.6

Time : 15 Minutes

Copyright (C) 2010, http://www.dabeaz.com 4- 53

Web Frameworks
• Python has a huge number of web frameworks
    • Zope
    • Django
    • Turbogears
    • Pylons
    • CherryPy
    • Google App Engine
• Frankly, there are too many to list here...
Copyright (C) 2010, http://www.dabeaz.com 4- 54

Web Frameworks
• Web frameworks build upon previous concepts
• Provide additional support for
    • Form processing
    • Cookies/sessions
    • Database integration
    • Content management
• Usually require their own training course
Copyright (C) 2010, http://www.dabeaz.com 4- 55

Commentary
• If you're building small self-contained
components or middleware for use on the
web, you're probably better off with WSGI
• The programming interface is minimal
• The components you create will be self-
contained if you're careful with your design
• Since WSGI is an official part of Python,
virtually all web frameworks will support it
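
As an illustration of "middleware", here is a sketch of a wrapper that takes one WSGI application and returns another. It assumes Python 3; add_header and the X-Course header are made-up names for the example:

```python
def add_header(app, name, value):
    """Hypothetical middleware: wrap a WSGI app and append one response header."""
    def wrapper(environ, start_response):
        def custom_start_response(status, headers, exc_info=None):
            return start_response(status, headers + [(name, value)], exc_info)
        return app(environ, custom_start_response)
    return wrapper

def hello_app(environ, start_response):
    start_response("200 OK", [("Content-type", "text/plain")])
    return [b"Hello World\n"]

wrapped = add_header(hello_app, "X-Course", "Python Network Programming")

# Stand-in for the server: record what reaches start_response
seen = {}
def record(status, headers, exc_info=None):
    seen["status"], seen["headers"] = status, headers

body = b"".join(wrapped({}, record))
```

The wrapped object is itself a WSGI application, so wrappers like this can be stacked and deployed under any WSGI server.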

Copyright (C) 2010, http://www.dabeaz.com 4- 56

Section 5

Advanced Networking

Overview

• An assortment of advanced networking topics
    • The Python network programming stack
    • Concurrent servers
    • Distributed computing
    • Multiprocessing

Copyright (C) 2010, http://www.dabeaz.com 5- 2

Problem with Sockets

• In part 1, we looked at low-level programming
with sockets
• Although it is possible to write applications
based on that interface, most of Python's
network libraries use a higher level interface
• For servers, there's the SocketServer module

Copyright (C) 2010, http://www.dabeaz.com 5- 3

SocketServer

• A module for writing custom servers


• Supports TCP and UDP networking
• The module aims to simplify some of the
low-level details of working with sockets and
put all of that functionality in one place

Copyright (C) 2010, http://www.dabeaz.com 5- 4

SocketServer Example
• To use SocketServer, you define handler
objects using classes
• Example: A time server
import SocketServer
import time

class TimeHandler(SocketServer.BaseRequestHandler):
    def handle(self):
        self.request.sendall(time.ctime()+"\n")

serv = SocketServer.TCPServer(("",8000),TimeHandler)
serv.serve_forever()

Copyright (C) 2010, http://www.dabeaz.com 5- 5

SocketServer Example
• Handler Class
import SocketServer
import time

# Server is implemented by a handler class
class TimeHandler(SocketServer.BaseRequestHandler):
    def handle(self):
        self.request.sendall(time.ctime()+"\n")

serv = SocketServer.TCPServer(("",8000),TimeHandler)
serv.serve_forever()

Copyright (C) 2010, http://www.dabeaz.com 5- 6

SocketServer Example
• Handler Class
import SocketServer
import time

# Must inherit from BaseRequestHandler
class TimeHandler(SocketServer.BaseRequestHandler):
    def handle(self):
        self.request.sendall(time.ctime())

serv = SocketServer.TCPServer(("",8000),TimeHandler)
serv.serve_forever()

Copyright (C) 2010, http://www.dabeaz.com 5- 7

SocketServer Example
• handle() method
import SocketServer
import time

class TimeHandler(SocketServer.BaseRequestHandler):
    # Define handle() to implement the server action
    def handle(self):
        self.request.sendall(time.ctime())

serv = SocketServer.TCPServer(("",8000),TimeHandler)
serv.serve_forever()

Copyright (C) 2010, http://www.dabeaz.com 5- 8

SocketServer Example
• Client socket connection
import SocketServer
import time

class TimeHandler(SocketServer.BaseRequestHandler):
    def handle(self):
        # self.request is the socket object for the client connection
        self.request.sendall(time.ctime())

serv = SocketServer.TCPServer(("",8000),TimeHandler)
serv.serve_forever()

• This is a bare socket object


Copyright (C) 2010, http://www.dabeaz.com 5- 9

SocketServer Example
• Creating and running the server
import SocketServer
import time

class TimeHandler(SocketServer.BaseRequestHandler):
    def handle(self):
        self.request.sendall(time.ctime())

# Creates a server and connects a handler
serv = SocketServer.TCPServer(("",8000),TimeHandler)
# Runs the server forever
serv.serve_forever()

Copyright (C) 2010, http://www.dabeaz.com 5- 10

Execution Model
• Server runs in a loop waiting for requests
• On each connection, the server creates a
new instantiation of the handler class
• The handle() method is invoked to handle
the logic of communicating with the client
• When handle() returns, the connection is
closed and the handler instance is destroyed
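
The per-connection lifecycle described above can be observed without opening a port. A minimal sketch, assuming Python 3 (where the module is named socketserver and sendall() needs bytes); the connected socket pair is a stand-in for a real client connection, and passing None as the server is a shortcut that works only because this handler never touches self.server:

```python
import socket
import socketserver      # named SocketServer in Python 2
import time

class TimeHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # self.request is the connected socket for this client
        self.request.sendall((time.ctime() + "\n").encode("ascii"))

# A connected pair of sockets stands in for a real client connection
server_side, client_side = socket.socketpair()

# What the server does for each connection: create a handler instance,
# whose constructor runs setup(), handle(), and finish() in order
TimeHandler(server_side, ("127.0.0.1", 0), None)
server_side.close()      # the real server closes the connection afterwards

reply = client_side.recv(1024).decode("ascii")
client_side.close()
```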

Copyright (C) 2010, http://www.dabeaz.com 5- 11

Exercise 5.1

Time : 15 Minutes

Copyright (C) 2010, http://www.dabeaz.com 5- 12

Big Picture
• A major goal of SocketServer is to simplify
the task of plugging different server handler
objects into different kinds of server
implementations
• For example, servers with different
implementations of concurrency, extra
security features, etc.

Copyright (C) 2010, http://www.dabeaz.com 5- 13

Concurrent Servers
• SocketServer supports different kinds of
concurrency implementations
TCPServer - Synchronous TCP server (one client)
ForkingTCPServer - Forking server (multiple clients)
ThreadingTCPServer - Threaded server (multiple clients)

• Just pick the server that you want and plug
the handler object into it
serv = SocketServer.ForkingTCPServer(("",8000),TimeHandler)
serv.serve_forever()

serv = SocketServer.ThreadingTCPServer(("",8000),TimeHandler)
serv.serve_forever()

Copyright (C) 2010, http://www.dabeaz.com 5- 14

Server Mixin Classes
• SocketServer defines these mixin classes
ForkingMixIn
ThreadingMixIn

• These can be used to add concurrency to
other server objects (via multiple inheritance)
from BaseHTTPServer import HTTPServer
from SimpleHTTPServer import SimpleHTTPRequestHandler
from SocketServer import ThreadingMixIn

class ThreadedHTTPServer(ThreadingMixIn, HTTPServer):
    pass

serv = ThreadedHTTPServer(("",8080),
                          SimpleHTTPRequestHandler)
serv.serve_forever()

Copyright (C) 2010, http://www.dabeaz.com 5- 15

Server Subclassing
• SocketServer objects are also subclassed to
provide additional customization
• Example: Security/Firewalls
from SocketServer import TCPServer

class RestrictedTCPServer(TCPServer):
    # Restrict connections to loopback interface
    def verify_request(self,request,addr):
        host, port = addr
        if host != '127.0.0.1':
            return False
        else:
            return True

# TimeHandler is the handler class from the earlier time server example
serv = RestrictedTCPServer(("",8080),TimeHandler)
serv.serve_forever()
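
The verify_request() hook can be checked in isolation. A sketch assuming Python 3 (socketserver); bind_and_activate=False builds the server object without binding a port, which is enough to call the method directly, and the addresses are made-up test values:

```python
import socketserver

class RestrictedTCPServer(socketserver.TCPServer):
    # Restrict connections to the loopback interface
    def verify_request(self, request, client_address):
        host, port = client_address
        return host == "127.0.0.1"

# Create the server object without binding or listening
serv = RestrictedTCPServer(("", 8080), socketserver.BaseRequestHandler,
                           bind_and_activate=False)
local_ok = serv.verify_request(None, ("127.0.0.1", 50000))
remote_ok = serv.verify_request(None, ("192.0.2.10", 50000))
serv.server_close()
```

If loopback-only access is the whole requirement, binding the server to ("127.0.0.1", 8080) instead of ("", 8080) is a simpler alternative; the hook is more useful when the accept/reject rule is more involved.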

Copyright (C) 2010, http://www.dabeaz.com 5- 16

Exercise 5.2

Time : 15 Minutes

Copyright (C) 2010, http://www.dabeaz.com 5- 17

Distributed Computing
• It is relatively simple to build Python
applications that span multiple machines or
operate on clusters

Copyright (C) 2010, http://www.dabeaz.com 5- 18

Discussion
• Keep in mind: Python is a "slow" interpreted
programming language
• So, we're not necessarily talking about high
performance computing in Python (e.g.,
number crunching, etc.)
• However, Python can serve as a very useful
distributed scripting environment for
controlling things on different systems

Copyright (C) 2010, http://www.dabeaz.com 5- 19

XML-RPC

• Remote Procedure Call


• Uses HTTP as a transport protocol
• Parameters/Results encoded in XML
• Supported by languages other than Python
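
To see what "encoded in XML" means in practice, the marshalling functions can be called without any network traffic. A minimal sketch, assuming Python 3, where xmlrpclib is named xmlrpc.client:

```python
import xmlrpc.client      # named xmlrpclib in Python 2

# Encode a call to add(3, 5) as an XML-RPC request body
request_xml = xmlrpc.client.dumps((3, 5), methodname="add")

# Decode it again, as a server would
params, method = xmlrpc.client.loads(request_xml)

# Encode the result as an XML-RPC response body, then decode it
response_xml = xmlrpc.client.dumps((8,), methodresponse=True)
result, _ = xmlrpc.client.loads(response_xml)
```

Printing request_xml shows the document that would be sent as the body of an HTTP POST.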

Copyright (C) 2010, http://www.dabeaz.com 5- 20

Simple XML-RPC
• How to create a stand-alone server
from SimpleXMLRPCServer import SimpleXMLRPCServer

def add(x,y):
    return x+y

s = SimpleXMLRPCServer(("",8080))
s.register_function(add)
s.serve_forever()

• How to test it (xmlrpclib)
>>> import xmlrpclib
>>> s = xmlrpclib.ServerProxy("http://localhost:8080")
>>> s.add(3,5)
8
>>> s.add("Hello","World")
"HelloWorld"
>>>

Copyright (C) 2010, http://www.dabeaz.com 5- 21

Simple XML-RPC
• Adding multiple functions
from SimpleXMLRPCServer import SimpleXMLRPCServer

s = SimpleXMLRPCServer(("",8080))
s.register_function(add)
s.register_function(foo)
s.register_function(bar)
s.serve_forever()

• Registering an instance (exposes all methods)


from SimpleXMLRPCServer import SimpleXMLRPCServer

s = SimpleXMLRPCServer(("",8080))
obj = SomeObject()
s.register_instance(obj)
s.serve_forever()

Copyright (C) 2010, http://www.dabeaz.com 5- 22

XML-RPC Commentary

• XML-RPC is extremely easy to use


• Almost too easy--you might get the perception
that it's extremely limited or fragile
• I have encountered a lot of major projects that
are using XML-RPC for distributed control
• Users seem to love it (I concur)

Copyright (C) 2010, http://www.dabeaz.com 5- 23

XML-RPC and Binary


• One word of caution...
• XML-RPC assumes all strings are UTF-8
encoded Unicode
• Consequence: You can't shove a string of raw
binary data through an XML-RPC call
• For binary: must base64 encode/decode
• base64 module can be used for this
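
The encode/decode round trip looks like this. A small sketch, assuming Python 3; the sample bytes are arbitrary:

```python
import base64

raw = bytes([0, 255, 16, 128, 10])          # arbitrary binary data

# Sender: turn the bytes into ASCII text that is safe inside XML
encoded = base64.b64encode(raw).decode("ascii")

# Receiver: recover the original bytes
decoded = base64.b64decode(encoded)
```

The XML-RPC client library also provides a Binary wrapper class (xmlrpclib.Binary in Python 2, xmlrpc.client.Binary in Python 3) that performs this encoding during marshalling.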
Copyright (C) 2010, http://www.dabeaz.com 5- 24

Exercise 5.3

Time : 15 Minutes

Copyright (C) 2010, http://www.dabeaz.com 5- 25

Serializing Python Objects


• In distributed applications, you may want to
pass various kinds of Python objects around
(e.g., lists, dicts, sets, instances, etc.)
• Libraries such as XML-RPC support simple
data types, but not anything more complex
• However, serializing arbitrary Python objects
into byte-strings is quite simple

Copyright (C) 2010, http://www.dabeaz.com 5- 26

pickle Module
• A module for serializing objects
• Serializing an object onto a "file"
import pickle
...
pickle.dump(someobj,f)

• Unserializing an object from a file


someobj = pickle.load(f)

• Here, a file might be a file, a pipe, a wrapper
around a socket, etc.

Copyright (C) 2010, http://www.dabeaz.com 5- 27

Pickling to Strings
• Pickle can also turn objects into byte strings
import pickle
# Convert to a string
s = pickle.dumps(someobj, protocol)
...
# Load from a string
someobj = pickle.loads(s)

• This can be used if you need to embed a
Python object into some other messaging
protocol or data encoding
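
Both styles of pickling can be tried in memory. A minimal sketch, assuming Python 3; the record dictionary is made-up sample data, and an in-memory BytesIO object plays the role of the "file":

```python
import io
import pickle

record = {"name": "Dave", "ports": [8000, 8080], "tags": {"web", "rpc"}}

# dump()/load() work with any binary file-like object
f = io.BytesIO()
pickle.dump(record, f)
f.seek(0)
from_file = pickle.load(f)

# dumps()/loads() work with byte strings directly
data = pickle.dumps(record, protocol=pickle.HIGHEST_PROTOCOL)
from_bytes = pickle.loads(data)
```

The protocol argument is optional; higher protocol numbers generally produce more compact output but cannot be read by older Python versions.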

Copyright (C) 2010, http://www.dabeaz.com 5- 28

Example
• Using pickle with XML-RPC
# addserv.py
import pickle

def add(px,py):
    x = pickle.loads(px)
    y = pickle.loads(py)
    return pickle.dumps(x+y)

from SimpleXMLRPCServer import SimpleXMLRPCServer

serv = SimpleXMLRPCServer(("",15000))
serv.register_function(add)
serv.serve_forever()

• Notice: All input arguments and return values
are encoded/decoded with pickle
Copyright (C) 2010, http://www.dabeaz.com 5- 29

Example
• Passing Python objects from the client
>>> import pickle
>>> import xmlrpclib
>>> serv = xmlrpclib.ServerProxy("http://localhost:15000")
>>> a = [1,2,3]
>>> b = [4,5]
>>> r = serv.add(pickle.dumps(a),pickle.dumps(b))
>>> c = pickle.loads(r)
>>> c
[1, 2, 3, 4, 5]
>>>

• Again, all input and return values are processed
through pickle

Copyright (C) 2010, http://www.dabeaz.com 5- 30

Miscellaneous Comments
• Pickle is really only useful if used in a
Python-only environment
    • Don't use it if you need to communicate
      with other programming languages
• There are also security concerns
    • Never use pickle with untrusted clients
      (maliciously crafted pickles can be used to
      execute arbitrary system commands)
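
One mitigation described in the Python documentation is to subclass pickle.Unpickler and override find_class(), which is the hook used to look up classes and functions named inside a pickle. The sketch below assumes Python 3 and goes to the extreme of refusing every such lookup; NoGlobalsUnpickler and restricted_loads are hypothetical names. It narrows the attack surface but is not a substitute for avoiding untrusted pickles:

```python
import io
import pickle

class NoGlobalsUnpickler(pickle.Unpickler):
    """Hypothetical helper: refuse any pickle that references a class or function."""
    def find_class(self, module, name):
        raise pickle.UnpicklingError(
            "global %s.%s is forbidden" % (module, name))

def restricted_loads(data):
    return NoGlobalsUnpickler(io.BytesIO(data)).load()

# Plain lists, dicts, and numbers need no lookups, so they load normally
safe = restricted_loads(pickle.dumps([1, 2, {"x": 3.5}]))

# A pickle that names a callable (here the built-in print) is rejected
try:
    restricted_loads(pickle.dumps(print))
    rejected = False
except pickle.UnpicklingError:
    rejected = True
```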

Copyright (C) 2010, http://www.dabeaz.com 5- 31

Exercise 5.4

Time : 15 Minutes

Copyright (C) 2010, http://www.dabeaz.com 5- 32

multiprocessing
• Python 2.6/3.0 include a new library module
(multiprocessing) that can be used for
different forms of distributed computation
• It is a substantial module that also addresses
interprocess communication, parallel
computing, worker pools, etc.
• Will only show a few network features here

Copyright (C) 2010, http://www.dabeaz.com 5- 33

Connections
• Creating a dedicated connection between
two Python interpreter processes
• Listener (server) process
from multiprocessing.connection import Listener
serv = Listener(("",16000),authkey="12345")
c = serv.accept()

• Client process
from multiprocessing.connection import Client
c = Client(("servername",16000),authkey="12345")

• On the surface, looks similar to a TCP connection


Copyright (C) 2010, http://www.dabeaz.com 5- 34

Connection Use
• Connections allow bidirectional message
passing of arbitrary Python objects

c.send(obj)          # On one end
obj = c.recv()       # On the other end

• Underneath the covers, everything routes
through the pickle module
• Similar to a network connection except that
you just pass objects through it
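
The same send()/recv() interface can be tried inside a single process using multiprocessing.Pipe(), which returns two connected Connection objects. This is only a sketch, assuming Python 3; a Listener/Client pair behaves the same way once connected (note that in Python 3 the authkey must be a byte string such as b"12345"):

```python
from multiprocessing import Pipe

# Pipe() returns two connected Connection objects; both ends live in
# this one process purely for demonstration
left, right = Pipe()

left.send({"op": "add", "args": ([1, 2, 3], [4, 5])})   # any picklable object
msg = right.recv()

right.send(msg["args"][0] + msg["args"][1])             # reply the other way
answer = left.recv()

left.close()
right.close()
```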
Copyright (C) 2010, http://www.dabeaz.com 5- 35

Example
• Example server using multiprocessing
# addserv.py

def add(x,y):
    return x+y

from multiprocessing.connection import Listener

serv = Listener(("",16000),authkey="12345")
c = serv.accept()
while True:
    x,y = c.recv()            # Receive a pair
    c.send(add(x,y))          # Send result of add(x,y)

• Note: Omitting a variety of error checking/
exception handling
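
As a rough idea of what the omitted error handling might look like, the sketch below stops cleanly when the client disconnects (recv() raises EOFError) and reports failures back instead of crashing. It assumes Python 3; serve() and the ("ok", ...) / ("error", ...) reply tuples are inventions for this example, and a Pipe stands in for the connection that Listener.accept() would return:

```python
import threading
from multiprocessing import Pipe

def add(x, y):
    return x + y

def serve(conn):
    """Answer (x, y) requests on one connection until the peer disconnects."""
    try:
        while True:
            try:
                msg = conn.recv()
            except EOFError:
                break                           # client closed the connection
            try:
                x, y = msg
                conn.send(("ok", add(x, y)))
            except Exception as e:
                conn.send(("error", str(e)))    # report failure, keep serving
    finally:
        conn.close()

server_end, client_end = Pipe()
t = threading.Thread(target=serve, args=(server_end,))
t.start()

client_end.send(([1, 2, 3], [4, 5]))
good = client_end.recv()
client_end.send((1, "a"))                       # int + str raises TypeError
bad = client_end.recv()
client_end.close()
t.join(timeout=5)
```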

Copyright (C) 2010, http://www.dabeaz.com 5- 36

Example
• Client connection with multiprocessing
>>> from multiprocessing.connection import Client
>>> client = Client(("localhost",16000),authkey="12345")
>>> a = [1,2,3]
>>> b = [4,5]
>>> client.send((a,b))
>>> c = client.recv()
>>> c
[1, 2, 3, 4, 5]
>>>

• Even though pickle is being used underneath
the covers, you don't see it here

Copyright (C) 2010, http://www.dabeaz.com 5- 37

Commentary
• Multiprocessing module already does the
work related to pickling, error handling, etc.
• Can use it as the foundation for something
more advanced
• There are many more features of
multiprocessing not shown here (e.g.,
features related to distributed objects,
parallel processing, etc.)

Copyright (C) 2010, http://www.dabeaz.com 5- 38

Commentary

• Multiprocessing is a good choice if you're
working strictly in a Python environment
    • It will be faster than XML-RPC
    • It has some security features (authkey)
    • More flexible support for passing Python
      objects around

Copyright (C) 2010, http://www.dabeaz.com 5- 39

What about...
• CORBA? SOAP? Others?
• There are third party libraries for this
• Honestly, most Python programmers aren't
into big heavyweight distributed object
systems like this (too much trauma)
• However, if you're into distributed objects,
you should probably look at the Pyro project
(http://pyro.sourceforge.net)

Copyright (C) 2010, http://www.dabeaz.com 5- 40

Network Wrap-up
• Have covered the basics of network support
that's bundled with Python (standard lib)
• Possible directions from here...
    • Concurrent programming techniques
      (often needed for server implementation)
    • Parallel computing (scientific computing)
    • Web frameworks
Copyright (C) 2010, http://www.dabeaz.com 5- 41

Exercise 5.5

Time : 15 Minutes

Copyright (C) 2010, http://www.dabeaz.com 5- 42

Python Network Programming Index

A
accept() method, of sockets, 1-19, 1-22
Address binding, TCP server, 1-20
Addressing, network, 1-4
Asynchronous network server, 1-52

B
BaseRequestHandler, SocketServer module, 5-5
bind() method, of sockets, 1-19, 1-20, 1-42
Browser, emulating in HTTP requests, 2-21
build_opener() function, urllib2 module, 2-24

C
cElementTree module, 3-22
cgi module, 4-30
CGI scripting, 4-23, 4-24, 4-25, 4-26, 4-27
CGI scripting, and WSGI, 4-48
CGI scripting, creating a response, 4-31, 4-32
CGI scripting, environment variables, 4-28
CGI scripting, I/O model, 4-28
CGI scripting, parsing query variables, 4-30
CGI scripting, query string, 4-26
CGI scripting, query variables, 4-29
CherryPy, 4-54
Client objects, multiprocessing module, 5-34
Client/Server programming, 1-8
close() method, of sockets, 1-16, 1-25
Concurrency, and socket programming, 1-46
connect() method, of sockets, 1-16
Connections, network, 1-7
Content encoding, HTTP responses, 4-9
Cookie handling and HTTP requests, 2-25
Cookies, and urllib2 module, 2-17
CORBA, 5-40
Creating custom openers for HTTP requests, 2-24
csv module, 3-3

D
Datagram, 1-43
Distributed computing, 5-18, 5-19
Django, 4-54
dump() function, pickle module, 5-27
dumps() function, pickle module, 5-28

E
ElementTree module, modifying document structure, 3-23
ElementTree module, performance, 3-22
ElementTree module, xml.etree package, 3-14
ElementTree, attributes, 3-19
ElementTree, incremental XML parsing, 3-25
ElementTree, wildcards, 3-20
ElementTree, writing XML, 3-24
End of file, of sockets, 1-32
environ variable, os module, 4-28
Error handling, HTTP requests, 2-22

F
FieldStorage object, cgi module, 4-30
File upload, via urllib, 2-28
Files, creating from a socket, 1-37
Forking server, 1-51
ForkingMixIn class, SocketServer module, 5-15
ForkingTCPServer, SocketServer module, 5-14
ForkingUDPServer, SocketServer module, 5-14
Form data, posting in an HTTP request, 2-10, 2-11, 2-20
FTP server, interacting with, 2-29
FTP, uploading files to a server, 2-30
ftplib module, 2-29

G
gethostbyaddr() function, socket module, 1-53
gethostbyname() function, socket module, 1-53
gethostname() function, socket module, 1-53
Google AppEngine, 4-54

H
Hostname, 1-4
Hostname, obtaining, 1-53
HTML, parsing of, 3-4, 3-7
HTMLParser module, 3-5, 3-7
HTTP cookies, 2-25
HTTP protocol, 4-5
HTTP request, with cookie handling, 2-25
HTTP status code, obtaining with urllib, 2-14
HTTP, client-side protocol, 2-31
HTTP, methods, 4-8
HTTP, request structure, 4-6
HTTP, response codes, 4-8
HTTP, response content encoding, 4-9
HTTP, response structure, 4-7, 4-10, 4-12
httplib module, 2-31

I
Interprocess communication, 1-44
IP address, 1-4
IPC, 1-44
IPv4 socket, 1-13
IPv6 socket, 1-13

J
JSON, 3-29
json module, 3-31

L
Limitations, of urllib module, 2-28
listen() method, of sockets, 1-19, 1-21
Listener objects, multiprocessing module, 5-34
load() function, pickle module, 5-27
loads() function, pickle module, 5-28

M
makefile() method, of sockets, 1-37
multiprocessing module, 5-33

N
netstat, 1-6
Network addresses, 1-4, 1-7
Network programming, client-server concept, 1-8
Network programming, standard port assignments, 1-5

O
Objects, serialization of, 5-26
Opener objects, urllib2 module, 2-23
OpenSSL, 2-5

P
Parsing HTML, 3-7
Parsing, JSON, 3-29
Parsing, of HTML, 3-5
pickle module, 5-27
POST method, of HTTP requests, 2-6, 2-7
Posting form data, HTTP requests, 2-10, 2-11, 2-20
Pylons, 4-54

Q
Query string, and CGI scripting, 4-26

R
Raw Sockets, 1-45
recv() method, of sockets, 1-16
recvfrom() method, of sockets, 1-42, 1-43
Request objects, urllib2 module, 2-19
Request-response cycle, network programming, 1-9
RFC-2822 headers, 4-6

S
sax module, xml package, 3-11
select module, 1-52
select() function, select module, 1-52
send() method, of sockets, 1-16, 1-24
sendall() method, of sockets, 1-31
Sending email, 2-32
sendto() method, of sockets, 1-42, 1-43
Serialization, of Python objects, 5-26
serve_forever() method, SocketServer, 5-5
setsockopt() method, of sockets, 1-36
settimeout() method, of sockets, 1-34
SimpleXMLRPCServer module, 5-21
simple_server module, wsgiref package, 4-46, 4-47
smtplib module, 2-32
SOAP, 5-40
socket module, 1-13
socket() function, socket module, 1-13
Socket, using for server or client, 1-15
Socket, wrapping with a file object, 1-37
Sockets, 1-12, 1-13
Sockets, and concurrency, 1-46
Sockets, asynchronous server, 1-52
Sockets, end of file indication, 1-32
Sockets, forking server example, 1-51
Sockets, partial reads and writes, 1-29
Sockets, setting a timeout, 1-34
Sockets, setting options, 1-36
Sockets, threaded server, 1-50
SocketServer module, 5-4
SocketServer, subclassing, 5-16
Standard port assignments, 1-5

T
TCP, 1-13, 1-14
TCP, accepting new connections, 1-22
TCP, address binding, 1-20
TCP, client example, 1-16
TCP, communication with client, 1-23
TCP, example with SocketServer module, 5-5
TCP, listening for connections, 1-21
TCP, server example, 1-19
TCPServer, SocketServer module, 5-10
Telnet, using with network applications, 1-10
Threaded network server, 1-50
ThreadingMixIn class, SocketServer module, 5-15
ThreadingTCPServer, SocketServer module, 5-14
ThreadingUDPServer, SocketServer module, 5-14
Threads, and network servers, 1-50
Timeout, on sockets, 1-34
Turbogears, 4-54
Twisted framework, 1-52

U
UDP, 1-13, 1-41
UDP, client example, 1-43
UDP, server example, 1-42
UDPServer, SocketServer module, 5-14
Unix domain sockets, 1-44
Uploading files, to an FTP server, 2-30
URL, parameter encoding, 2-6, 2-7
urlencode() function, urllib module, 2-9
urllib module, 2-3
urllib module, limitations, 2-28
urllib2 module, 2-17
urllib2 module, error handling, 2-22
urllib2 module, Request objects, 2-19
urlopen() function, obtaining response headers, 2-13
urlopen() function, obtaining status code, 2-14
urlopen() function, reading responses, 2-12
urlopen() function, urllib module, 2-4
urlopen() function, urllib2 module, 2-18
urlopen(), posting form data, 2-10, 2-11, 2-20
urlopen(), supported protocols, 2-5
User-agent, setting in HTTP requests, 2-21

V
viewing open network connections, 1-6

W
Web frameworks, 4-54, 4-55
Web programming, and WSGI, 4-35, 4-36
Web programming, CGI scripting, 4-23, 4-24, 4-25, 4-26, 4-27
Web services, 2-8
Webdav, 2-28
WSGI, 4-36
WSGI (Web Server Gateway Interface), 4-35
WSGI, and CGI environment variables, 4-39
WSGI, and wsgi.* variables, 4-40
WSGI, application inputs, 4-38
WSGI, applications, 4-37
WSGI, parsing query string, 4-41
WSGI, producing content, 4-44
WSGI, response encoding, 4-45
WSGI, responses, 4-42
WSGI, running a stand-alone server, 4-46, 4-47
WSGI, running applications within a CGI script, 4-48
WWW, see HTTP, 4-5

X
XML, element attributes, 3-19
XML, element wildcards, 3-20
XML, ElementTree interface, 3-15, 3-16
XML, ElementTree module, 3-14
XML, finding all matching elements, 3-18
XML, finding matching elements, 3-17
XML, incremental parsing of, 3-25
XML, modifying document structure with ElementTree, 3-23
XML, parsing with SAX, 3-9
XML, writing to files, 3-24
XML-RPC, 5-20

Z
Zope, 4-54
