SENG419-python 98745

This document provides an introduction to a course on data science. It discusses the growing demand for data scientists and explains what data science is. The course will provide an overview of key data science topics through a mixture of theory and practical applications using Python. Grading will be based on assignments, exams, and a final project. The document also introduces Python programming and explains how to set up an Anaconda environment for the coursework.

Uploaded by

muhmmad sheeraz

Introduction to Data Science

Course intro & Python tutorial

Bilal Shoaib Khan

Contact for the course
• Instructor: Dr. Bilal Shoaib Khan
– [email protected]
Plan for this lecture
• Data Science - why all the excitement
• What is data science
• Course information – syllabus, grading, etc.
• Basic Python programming
Data Scientists are in high demand
Also in academia
Pays Well
Demand will outpace supply
Data Scientist job trend in the last 3 years
[Chart: share of job postings (0.151%) vs. jobseeker interest (0.074%). Source: indeed.com]
Data Science: Why all the Excitement?
e.g.,
Google Flu Trends:

Detecting outbreaks
two weeks ahead
of CDC data

New models are estimating

which cities are most at risk
for spread of the Ebola virus.

Why all the Excitement?
The unreasonable effectiveness of Deep
Learning (CNNs)
2012 Imagenet challenge:
Classify 1 million images into 1000 classes.

“Big Data” Sources
It’s All Happening On-line
Every:
• Click
• Ad impression
• Billing event
• Fast forward, pause, …
• Server request
• Transaction
• Network message
• Fault
• …
User Generated (Web & Mobile)
Internet of Things / M2M
Health/Scientific Computing

Graph Data
Lots of interesting data
has a graph structure:
• Social networks
• Communication networks
• Computer Networks
• Road networks
• Citations
• Collaborations/Relationships
• …

Some of these graphs can get quite large (e.g., the Facebook user graph)
There's certainly a lot of it!
Data, data everywhere…

[Chart: data produced each year, logarithmic scale]
• 2002: 5 EB
• 2006: 161 EB
• 2009: 800 EB
• 2011: 1.8 ZB
• 2015: 8.0 ZB
Reference points on the same scale:
• Human brain's capacity: 14 PB
• 100 years of HD video: 60 PB (120 PB with audio)
Units: 1 Petabyte = 1000 TB, 1 TB = 1000 GB

References
• (2002) 5 EB: http://www2.sims.berkeley.edu/research/projects/how-much-info-2003/execsum.htm
• (2006) 161 EB: http://www.emc.com/collateral/analyst-reports/expanding-digital-idc-white-paper.pdf
• (2009) 800 EB: http://www.emc.com/collateral/analyst-reports/idc-digital-universe-are-you-ready.pdf
• (2011) 1.8 ZB: http://www.emc.com/leadership/programs/digital-universe.htm
• (2015) 8 ZB: http://www.emc.com/collateral/analyst-reports/idc-extracting-value-from-chaos-ar.pdf
• (brain) 14 PB: http://www.quora.com/Neuroscience-1/How-much-data-can-the-human-brain-store
• (life in video) 60 PB: in 4320p resolution, extrapolated from 16MB for 1:21 of 640x480 video (w/sound); almost certainly a gross overestimate, as sleep can be compressed significantly!


“Data is the New Oil”
– World Economic Forum 2011
“Data Science” an Emerging Field

O’Reilly Radar report, 2011

Data Science – A Definition

Data Science is the science which uses computer science, statistics,
machine learning, visualization, and human-computer interaction
to collect, clean, integrate, analyze, visualize, and interact with
data in order to create data products.

Goal of Data Science

Turn data into data products.

How to use data?
• Data => exploratory analysis => knowledge
models => product / decision making
• Data => predictive models => evaluate /
interpret => product / decision making
Data Scientist’s Practice
[Diagram: a cycle with the following stages]
• Digging around in data
• Clean, prep
• Hypothesize model
• Evaluate, interpret
• Large-scale exploitation
Example data science applications
• Marketing: predict the characteristics of high lifetime
value (LTV) customers, which can be used to support
customer segmentation, identify upsell opportunities,
and support other marketing initiatives
• Logistics: forecast how many of which things you need
and where you will need them, which enables lean
inventory and prevents out-of-stock situations
• Healthcare: analyze survival statistics for different
patient attributes (age, blood type, gender, etc.) and
treatments; predict risk of re-admittance based on
patient attributes, medical history, etc.
More Examples
• Transaction Databases → Recommender systems (Netflix), Fraud Detection (Security and Privacy)
• Wireless Sensor Data → Smart Home, Real-time Monitoring, Internet of Things
• Text Data, Social Media Data → Product Review and Consumer Satisfaction (Facebook, Twitter, LinkedIn), E-discovery
• Software Log Data → Automatic Trouble Shooting (Splunk)
• Genotype and Phenotype Data → Epic, 23andme, Patient-Centered Care, Personalized Medicine
Data Science – One Definition
What’s Hard about Data Science
• Overcoming assumptions
• Making ad-hoc explanations of data patterns
• Overgeneralizing
• Communication
• Not checking enough (validate models, data pipeline
integrity, etc.)
• Using statistical tests correctly
• Prototype → Production transitions
• Data pipeline complexity (who do you ask?)
Data Science concerns
About the course
• A mixture of theory and practice
• Introductory, broad overview of subjects
• Focus on practical aspects, but not on ever-changing
technology and tools
• Seminar style - I am here to learn as well as to teach
• Language choice: Python
– Relatively easy to learn (for computer scientists) compared
to R (more popular among statisticians)
– Open source means easy access (as opposed to SAS or
MATLAB)
– Which one is more frequently used in data science?
Textbook
• Required:
– Data Science from Scratch (DSS) by Joel Grus
– Python for Data Analysis (PDA) by Wes McKinney
– Free e-book: Think Stats (TS) by Allen B. Downey
• Optional: Python Data Science Handbook (PDSH) by Jake VanderPlas
Grading policy
• 5% attendance and participation
• 20% homework assignments and in-class
exercises
• 25% midterm exam
• 50% final exam / project

• I reserve the right to slightly adjust the
weights of individual components if necessary
Brief introduction of Python
• Invented in the Netherlands, early 90s by Guido
van Rossum
• Open sourced from the beginning
• Considered a scripting language, but is much
more
– No compilation needed
– Scripts are evaluated by the interpreter, line by line
– Functions need to be defined before they are called
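The last point can be illustrated with a small sketch. The check happens when a call is executed, not when the file is read, so a function body may mention a function that is defined further down, as long as the call runs after that definition. The names greet and call_greet are made up for this illustration.

```python
# Hypothetical names (greet, call_greet), used only for illustration.

def call_greet():
    # The name `greet` is looked up when this line runs, not when it is parsed.
    return greet("world")

try:
    call_greet()                 # greet does not exist yet
    early_result = "worked"
except NameError:
    early_result = "NameError"

def greet(name):
    return "hello, " + name

late_result = call_greet()       # fine now: greet has been defined
print(early_result, "|", late_result)
```

In a script, this is why helper functions can appear in any order, provided the top-level code that starts everything sits at the bottom of the file.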
Different ways to run python
• Call python program via python interpreter from a Unix/windows
command line
– $ python testScript.py
– Or make the script directly executable, with additional header lines in the
script
• Using python console
– Typing in python statements. Limited functionality
>>> 3 +3
6
>>> exit()
• Using ipython console
– Typing in python statements. Very interactive.
In [167]: 3+3
Out [167]: 6
– Typing in %run testScript.py
– Many convenient “magic functions”
Anaconda for python3
• We’ll be using anaconda which includes python
environment and an IDE (spyder) as well as many
additional features
– Can also use Enthought
• Most python modules needed in data science are
already installed with the anaconda distribution
• Install with python 3.6 (and install python 2.7 as
secondary from anaconda prompt)
• Key diff between Python 2 and python 3
Ipython magic functions
• who, whos, who_ls
• time, timeit
• debug
• pwd, ls, cd, etc.
• ?
• ??
Python programming in <2 hours
• This is not a comprehensive python language
class
• Will focus on parts of the language that are worth
attention and useful in data science
• Two parts:
– Basics - today
– More advanced – next week and/or as we go
• Comprehensive Python language reference and
tutorial available in Anaconda Navigator under
“Learning” and on python.org
Formatting
• Many languages use curly braces to delimit blocks of code. Python
uses indentation. Incorrect indentation causes error.
• Comments start with #
• Colons start a new block in many constructs, e.g. function
definitions, if-then clause, for, while

for i in [1, 2, 3, 4, 5]:
    # first line in "for i" block
    print (i)
    for j in [1, 2, 3, 4, 5]:
        # first line in "for j" block
        print (j)
        # last line in "for j" block
        print (i + j)
    # last line in "for i" block
    print (i)
print ("done looping")
• Whitespace is ignored inside parentheses and
brackets.
long_winded_computation = (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 +
                           9 + 10 + 11 + 12 + 13 + 14 +
                           15 + 16 + 17 + 18 + 19 + 20)

list_of_lists = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

easier_to_read_list_of_lists = [ [1, 2, 3],
                                 [4, 5, 6],
                                 [7, 8, 9] ]

Alternatively, end a line with a backslash to continue the statement:
long_winded_computation = 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + \
                          9 + 10 + 11 + 12 + 13 + 14 + \
                          15 + 16 + 17 + 18 + 19 + 20
Modules
• Certain features of Python are not loaded by
default
• In order to use these features, you’ll need to
import the modules that contain them.
• E.g.
import matplotlib.pyplot as plt
import numpy as np
Variables and objects
• A variable is created the first time it is assigned a
value
– No need to declare type
– Types are associated with objects, not variables
• X = 5
• X = [1, 3, 5]
• X = 'python'
– Assignment creates references, not copies
X = [1, 3, 5]
Y = X
X[0] = 2
print (Y) # Y is [2, 3, 5]
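A short sketch of both points, using the built-in type() function and the is operator. The variable names are arbitrary, and list(a) is just one of several ways to make a copy.

```python
# The type belongs to the object a name currently refers to, not to the name.
x = 5
type_before = type(x).__name__   # 'int'
x = "python"
type_after = type(x).__name__    # 'str'

# Assignment binds a second name to the same object; list(a) builds a new list.
a = [1, 3, 5]
b = a            # b refers to the same list object as a
c = list(a)      # c is a separate list with the same elements
a[0] = 2

same_object = b is a      # True: one object, two names
copy_is_same = c is a     # False: c is an independent copy
print(type_before, type_after, b, c)
```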
Assignment
• You can assign to multiple names at the same
time
x, y = 2, 3
• To swap values
x, y = y, x
• Assignments can be chained
x=y=z=3
• Accessing a name before it’s been created (by
assignment), raises an error
Arithmetic
• a = 5 + 2 # a is 7
• b = 9 - 3. # b is 6.0
• c = 5 * 2 # c is 10
• d = 5 ** 2 # d is 25
• e = 5 % 2 # e is 1

Built-in numerical types: int, float, complex

• f = 7 / 2 # in python 2, f is 3, unless “from __future__ import division” is used
• f = 7 / 2 # in python 3, f is 3.5
• f = 7 // 2 # f is 3 in both python 2 and 3
• f = 7 / 2. # f is 3.5 in both python 2 and 3
• f = 7 / float(2) # f is 3.5 in both python 2 and 3
• f = int(7 / 2) # f is 3 in both python 2 and 3
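For positive numbers, 7 // 2 and int(7 / 2) agree, but they differ for negative operands. A minimal Python 3 sketch of the difference, with variable names chosen only for this example:

```python
q = 7 // 2                   # 3
r = 7 % 2                    # 1

# Floor division rounds toward negative infinity ...
neg_floor = -7 // 2          # -4
# ... while int() truncates toward zero.
neg_trunc = int(-7 / 2)      # -3

# divmod returns the quotient and remainder in one call.
quotient, remainder = divmod(7, 2)

# Floor division on floats returns a float.
float_floor = 7.0 // 2       # 3.0
print(q, r, neg_floor, neg_trunc, quotient, remainder, float_floor)
```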
String - 1
• Strings can be delimited by matching single or double
quotation marks
single_quoted_string = 'data science'
double_quoted_string = "data science"
escaped_string = 'Isn\'t this fun'
another_string = "Isn't this fun"

real_long_string = 'this is a really long string. \
It has multiple parts, \
but all in one line.'

• Use triple quotes for multi line strings

multi_line_string = """This is the first line.
and this is the second line
and this is the third line"""
String - 2
• Use raw strings to output backslashes
tab_string = "\t" # represents the tab character
len(tab_string) # is 1

not_tab_string = r"\t" # represents the characters '\' and 't'

len(not_tab_string) # is 2

• Strings can be concatenated (glued together) with the + operator,
and repeated with *
s = 3 * 'un' + 'ium' # s is 'unununium'
• Two or more string literals (i.e. the ones enclosed between quotes)
next to each other are automatically concatenated
s1 = 'Py' 'thon'
s2 = s1 + '2.7'
real_long_string = ('this is a really long string. '
                    'It has multiple parts, '
                    'but all in one line.')
List - 1
integer_list = [1, 2, 3]
heterogeneous_list = ["string", 0.1, True]
list_of_lists = [ integer_list, heterogeneous_list, [] ]
list_length = len(integer_list) # equals 3
list_sum = sum(integer_list) # equals 6
• Get the i-th element of a list
x = [i for i in range(10)] # is the list [0, 1, ..., 9]
zero = x[0] # equals 0, lists are 0-indexed
one = x[1] # equals 1
nine = x[-1] # equals 9, 'Pythonic' for last element
eight = x[-2] # equals 8, 'Pythonic' for next-to-last element
• Get a slice of a list
one_to_four = x[1:5] # [1, 2, 3, 4]
first_three = x[:3] # [0, 1, 2]
last_three = x[-3:] # [7, 8, 9]
three_to_end = x[3:] # [3, 4, ..., 9]
without_first_and_last = x[1:-1] # [1, 2, ..., 8]
copy_of_x = x[:] # [0, 1, 2, ..., 9]
another_copy_of_x = x[:3] + x[3:] # [0, 1, 2, ..., 9]
List - 2
• Check for memberships
1 in [1, 2, 3] # True
0 in [1, 2, 3] # False
• Concatenate lists
x = [1, 2, 3]
y = [4, 5, 6]
x.extend(y) # x is now [1,2,3,4,5,6]

x = [1, 2, 3]
y = [4, 5, 6]
z = x + y # z is [1,2,3,4,5,6]; x is unchanged.
• List unpacking (multiple assignment)
x, y = [1, 2] # x is 1 and y is 2
[x, y] = 1, 2 # same as above
x, y = [1, 2] # same as above
x, y = 1, 2 # same as above
_, y = [1, 2] # y is 2, didn't care about the first element
List - 3
• Modify content of list
x = [0, 1, 2, 3, 4, 5, 6, 7, 8]
x[2] = x[2] * 2 # x is [0, 1, 4, 3, 4, 5, 6, 7, 8]
x[-1] = 0 # x is [0, 1, 4, 3, 4, 5, 6, 7, 0]
x[3:5] = [v * 3 for v in x[3:5]] # x is [0, 1, 4, 9, 12, 5, 6, 7, 0]
# note: x[3:5] * 3 would repeat the slice ([3, 4, 3, 4, 3, 4]), not multiply its elements
x[5:6] = [] # x is [0, 1, 4, 9, 12, 6, 7, 0]
del x[:2] # x is [4, 9, 12, 6, 7, 0]
del x[:] # x is []
del x # referencing x hereafter is a NameError

• Strings can also be sliced. But they cannot be modified (they are immutable)
s = 'abcdefg'
a = s[0] # 'a'
x = s[:2] # 'ab'
y = s[-3:] # 'efg'
s[:2] = 'AB' # this will cause an error (TypeError)
s = 'AB' + s[2:] # s is now 'ABcdefg'
The range() function
for i in range(5):
    print (i) # will print 0, 1, 2, 3, 4 (in separate lines)
for i in range(2, 5):
    print (i) # will print 2, 3, 4
for i in range(0, 10, 2):
    print (i) # will print 0, 2, 4, 6, 8
for i in range(10, 2, -2):
    print (i) # will print 10, 8, 6, 4
>>> a = ['Mary', 'had', 'a', 'little', 'lamb']
>>> for i in range(len(a)):
...     print(i, a[i])
...
0 Mary
1 had
2 a
3 little
4 lamb
Range() in python 2 and 3
• In python 2, range(5) is equivalent to [0, 1, 2, 3, 4]
• In python 3, range(5) is an object which can be iterated and indexed,
but is not identical to [0, 1, 2, 3, 4] (a lazy sequence: values are produced on demand)

print (range(3)) # in python 3, will see "range(0, 3)"
print (range(3)) # in python 2, will see "[0, 1, 2]"
print (list(range(3))) # will print [0, 1, 2] in python 3

x = range(5)
print (x[2]) # in python 2, will print "2"
print (x[2]) # in python 3, will also print "2"

x[2] = 5 # in python 2, x becomes [0, 1, 5, 3, 4]
x[2] = 5 # in python 3, will cause an error (TypeError).
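The Python 3 behaviour can be checked with a short sketch. It assumes Python 3 only, and the variable names are arbitrary.

```python
r = range(0, 10, 2)

length = len(r)        # a range knows its length without storing the values
third = r[2]           # indexing works
has_six = 6 in r       # membership tests work too
as_list = list(r)      # materialize the values only when a real list is needed

# A range cannot be modified in place.
try:
    r[2] = 5
    assign_error = None
except TypeError as err:
    assign_error = type(err).__name__

# Creating a huge range is cheap because no values are stored.
big = range(10**12)
big_item = big[10**11]
print(length, third, has_six, as_list, assign_error, big_item)
```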
Ref to lists
• What is the expected output for the following code?

a = list(range(10))
b = a
b[0] = 100
print(a) # [100, 1, 2, 3, 4, 5, 6, 7, 8, 9]

a = list(range(10))
b = a[:]
b[0] = 100
print(a) # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
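One caveat worth sketching: a slice copy such as a[:] is shallow, so with nested lists the inner lists are still shared. The example below goes slightly beyond the slide by using copy.deepcopy from the standard library, and the variable names are made up.

```python
import copy

nested = [[1, 2], [3, 4]]
alias = nested                  # same outer list
shallow = nested[:]             # new outer list, same inner lists
deep = copy.deepcopy(nested)    # new outer list and new inner lists

nested[0][0] = 100              # mutate an inner list
nested.append([5, 6])           # change the outer list

print(alias)    # sees both changes
print(shallow)  # sees the inner change only
print(deep)     # sees neither change
```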
tuples
• Similar to lists, but are immutable
• a_tuple = (0, 1, 2, 3, 4)
• other_tuple = 3, 4
• another_tuple = tuple([0, 1, 2, 3, 4])
• heterogeneous_tuple = ('john', 1.1, [1, 2])
• Note: a tuple is defined by the comma, not the parentheses, which are
only used for convenience. So a = (1) is not a tuple, but a = (1,) is.

• Can be sliced, concatenated, or repeated
a_tuple[2:4] # will print (2, 3)
• Cannot be modified
a_tuple[2] = 5
TypeError: 'tuple' object does not support item assignment
Tuples - 2
• Useful for returning multiple values from
functions
def sum_and_product(x, y):
    return (x + y), (x * y)
sp = sum_and_product(2, 3) # equals (5, 6)
s, p = sum_and_product(5, 10) # s is 15, p is 50
• Tuples and lists can also be used for multiple
assignments
x, y = 1, 2
[x, y] = [1, 2]
(x, y) = (1, 2)
x, y = y, x
Dictionaries
• A dictionary associates values with unique keys
empty_dict = {} # Pythonic
empty_dict2 = dict() # less Pythonic
grades = { "Joel" : 80, "Tim" : 95 } # dictionary literal

• Access/modify value with key
joels_grade = grades["Joel"] # equals 80

try:
    kates_grade = grades["Kate"] # "Kate" is not a key yet
except KeyError:
    print ("no grade for Kate!")

grades["Tim"] = 99 # replaces the old value
grades["Kate"] = 100 # adds a third entry
num_students = len(grades) # equals 3
Dictionaries - 2
• Check for existence of key (here grades is { "Joel" : 80, "Tim" : 95 }, i.e. Kate has not been added)
joel_has_grade = "Joel" in grades # True
kate_has_grade = "Kate" in grades # False

• Use “get” to avoid KeyError and supply a default value
joels_grade = grades.get("Joel", 0) # equals 80
kates_grade = grades.get("Kate", 0) # equals 0
no_ones_grade = grades.get("No One") # the default default is None

• Get all items. In python 3, the following do not return lists but iterable view objects
all_keys = grades.keys() # all keys (a list in python 2)
all_values = grades.values() # all values (a list in python 2)
all_pairs = grades.items() # (key, value) tuples (a list in python 2)

# Which of the following is faster?
'Joel' in grades # faster. Hash table lookup.
'Joel' in list(all_keys) # slower. Linear scan of a list.
Difference between python 2 and
python 3: Iterable objects vs lists
• In Python 3, range() returns a lazy iterable object.
– Values are created when needed
x = range(10000000) # fast
– Can be accessed by index
x[10000] # allowed. fast

• Similarly, dict.keys(), dict.values(), and dict.items()
(also map, filter, zip, see next) return iterable objects
– Values can NOT be accessed by index
– Can convert to list if really needed
– Can use for loop to iterate
keys = grades.keys()
keys[0] # error (TypeError)
for key in keys: print (key) # ok
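A small Python 3 sketch of how the object returned by keys() behaves. The grades dictionary repeats the earlier example, and the remark about key order assumes Python 3.7 or later, where dictionaries preserve insertion order.

```python
grades = {"Joel": 80, "Tim": 95}
keys = grades.keys()

# The keys object cannot be indexed.
try:
    keys[0]
    index_error = None
except TypeError:
    index_error = "TypeError"

# Convert to a list when positional access is really needed.
first_key = list(keys)[0]

# The keys object is a live view: it reflects later changes to the dictionary.
grades["Kate"] = 100
keys_after = list(keys)
print(index_error, first_key, keys_after)
```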
Control flow - 1
• if-else
if 1 > 2:
    message = "if only 1 were greater than two..."
elif 1 > 3:
    message = "elif stands for 'else if'"
else:
    message = "when all else fails use else (if you want to)"
print (message)

parity = "even" if x % 2 == 0 else "odd"

• Difference between python 2 and python 3 print
• In python 2, print is a statement
• print(message) and print message are both valid
• In python 3, print is a function
• Only print(message) is valid
Truthiness
• True
• False
• None
• and
• or
• not
• any
• all

All keywords are case sensitive.
0, 0.0, [], (), '', None are considered False. Most other values are True.

In [137]: print ("True") if '' else print ('False')
False

a = [0, 0, 0, 1]
any(a)
Out[135]: True
all(a)
Out[136]: False
Comparison
Operation   Meaning
<           strictly less than
<=          less than or equal
>           strictly greater than
>=          greater than or equal
==          equal
!=          not equal
is          object identity
is not      negated object identity

a = [0, 1, 2, 3, 4]
b = a
c = a[:]

a == b
Out[129]: True
a is b
Out[130]: True
a == c
Out[132]: True
a is c
Out[133]: False

Bitwise operators: & (AND), | (OR), ^ (XOR), ~ (NOT), << (Left Shift), >> (Right Shift)
Control flow - 2
• loops
x = 0
while x < 10:
    print (x, "is less than 10")
    x += 1

What happens if we forget to indent?

• Keyword pass in loops: does nothing, empty statement placeholder
for x in range(10):
    pass

for x in range(10):
    if x == 3:
        continue # go immediately to the next iteration
    if x == 5:
        break # quit the loop entirely
    print (x)
Exceptions
try:
    print (0 / 0)
except ZeroDivisionError:
    print ("cannot divide by zero")

https://docs.python.org/3/tutorial/errors.html
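The try statement also accepts optional else and finally clauses, as described in the tutorial linked above. A minimal sketch follows. The function safe_divide and the calls list are hypothetical names introduced for this illustration.

```python
calls = []  # records every attempted division, successful or not

def safe_divide(a, b):
    """Return a / b, or None when b is zero."""
    try:
        result = a / b
    except ZeroDivisionError:
        print("cannot divide by zero")
        return None
    else:
        # Runs only when the try block raised no exception.
        return result
    finally:
        # Runs in every case, even when a return statement was hit above.
        calls.append((a, b))

ok = safe_divide(7, 2)
bad = safe_divide(0, 0)
print(ok, bad, calls)
```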
Functions - 1
• Functions are defined using def
def double(x):
    """this is where you put an optional docstring
    that explains what the function does.
    for example, this function multiplies its
    input by 2"""
    return x * 2
• You can call a function after it is defined
z = double(10) # z is 20
• You can give default values to parameters
def my_print(message="my default message"):
    print (message)

my_print("hello") # prints 'hello'
my_print() # prints 'my default message'
Functions - 2
• Sometimes it is useful to specify arguments by name
def subtract(a=0, b=0):
    return a - b

subtract(10, 5) # returns 5
subtract(0, 5) # returns -5
subtract(b = 5) # same as above
subtract(b = 5, a = 0) # same as above
Functions - 3
• Functions are objects too
In [12]: def double(x): return x * 2
...: DD = double
...: DD(2)
...:
Out[12]: 4
In [16]: def apply_to_one(f):
...:     return f(1)
...: x = apply_to_one(DD)
...: x
...:
Out[16]: 2
Functions – lambda expression
• Small anonymous functions can be created
with the lambda keyword.
In [18]: y=apply_to_one(lambda x: x+4)

In [19]: y
Out[19]: 5

In [104]: def small_func(x): return x+4

...: apply_to_one(small_func)
Out[104]: 5
lambda expression - 2
• Small anonymous functions can be created
with the lambda keyword.
In [22]: pairs = [(2, 'two'), (3, 'three'), (1, 'one'), (4, 'four')]
...: pairs.sort(key=lambda pair: pair[0])
...: pairs
Out[22]: [(1, 'one'), (2, 'two'), (3, 'three'), (4, 'four')]

In [107]: def getKey(pair): return pair[0]

...: pairs.sort(key=getKey)
...: pairs
Out[107]: [(1, 'one'), (2, 'two'), (3, 'three'), (4, 'four')]
Sorting list
• sorted(list): keeps the original list intact and returns
a new sorted list
• list.sort: sorts the original list in place
x = [4,1,2,3]
y = sorted(x) # is [1,2,3,4], x is unchanged
x.sort() # now x is [1,2,3,4]

• Change the default behavior of sorted

# sort the list by absolute value from largest to smallest
x = [-4,1,-2,3]
y = sorted(x, key=abs, reverse=True) # is [-4,3,-2,1]
# sort the grades from highest to lowest
# using an anonymous function
# (python 3 no longer allows "lambda (name, grade): grade")
newgrades = sorted(grades.items(),
                   key=lambda pair: pair[1],
                   reverse=True)
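Sorting (name, grade) pairs by grade can also be written with operator.itemgetter from the standard library instead of a lambda. A short sketch, with a grades dictionary made up for this example:

```python
from operator import itemgetter

grades = {"Joel": 80, "Tim": 95, "Kate": 100}

# itemgetter(1) picks element 1 (the grade) of each (name, grade) pair.
by_grade = sorted(grades.items(), key=itemgetter(1), reverse=True)

# Iterating over a dict yields its keys, so this sorts the names.
by_name = sorted(grades)

top_name, top_grade = by_grade[0]
print(by_grade, by_name, top_name, top_grade)
```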
List comprehension
• A very convenient way to create a new list

In [51]: squares = [x * x for x in range(5)]

In [52]: squares
Out[52]: [0, 1, 4, 9, 16]

In [64]: for x in range(5): squares[x] = x * x
...: squares
Out[64]: [0, 1, 4, 9, 16]
List comprehension - 2
• Can also be used to filter list
In [65]: even_numbers = [x for x in range(5) if x % 2 == 0]
In [66]: even_numbers
Out[66]: [0, 2, 4]

In [68]: even_numbers = []
In [69]: for x in range(5):
...:     if x % 2 == 0:
...:         even_numbers.append(x)
...: even_numbers
Out[69]: [0, 2, 4]
List comprehension - 3
• More complex examples:
# create 100 pairs (0,0) (0,1) ... (9,8), (9,9)
pairs = [(x, y)
         for x in range(10)
         for y in range(10)]

# only pairs with x < y,
# range(lo, hi) equals
# [lo, lo + 1, ..., hi - 1]
increasing_pairs = [(x, y)
                    for x in range(10)
                    for y in range(x + 1, 10)]
Functools: map, reduce, filter
• Do not confuse with MapReduce in big data
• Convenient tools in python to apply function
to sequences of data
In [203]: def double(x): return 2*x
...: b=range(5)
...: list(map(double, b))
Out[203]: [0, 2, 4, 6, 8]

In [205]: [double(i) for i in range(5)]
Out[205]: [0, 2, 4, 6, 8]

In [204]: double(b)
Traceback (most recent call last):
…
TypeError: unsupported operand type(s) for *: 'int' and 'range'
Functools: map, reduce, filter
In [208]: def is_even(x): return x%2==0
...: a=[0, 1, 2, 3]
...: list(filter(is_even, a))
...:
Out[208]: [0, 2]

In [209]: [x for x in a if is_even(x)]

Out[209]: [0, 2]
Functools: map, reduce, filter
In [216]: from functools import reduce
In [217]: reduce(lambda x, y: x+y, range(10))
Out[217]: 45

In [220]: reduce(lambda x, y: x*y, [1, 2, 3, 4])

Out[220]: 24
zip
• Useful to combine multiple lists into a list of
tuples
In [238]: list(zip(['a', 'b', 'c'], [1, 2, 3], ['A', 'B', 'C']))
Out[238]: [('a', 1, 'A'), ('b', 2, 'B'), ('c', 3, 'C')]
In [245]: names = ['James', 'Tom', 'Mary']
...: grades = [100, 90, 95]
...: list(zip(names, grades))
...:
Out[245]: [('James', 100), ('Tom', 90), ('Mary', 95)]
Argument unpacking
• zip(*[a, b,c]) same as zip(a, b, c)
In [252]: gradeBook = [['James', 100],
['Tom', 90],
['Mary', 95]]
...: [names, grades]=zip(*gradeBook)
In [253]: names
Out[253]: ('James', 'Tom', 'Mary')
In [254]: grades
Out[254]: (100, 90, 95)

In [259]: list(zip(['James', 100], ['Tom', 90], ['Mary', 95]))

Out[259]: [('James', 'Tom', 'Mary'), (100, 90, 95)]
args and kwargs
• Convenient for taking a variable number of
unnamed and named parameters
In [260]: def magic(*args, **kwargs):
...:     print ("unnamed args:", args)
...:     print ("keyword args:", kwargs)
...: magic(1, 2, key="word", key2="word2")
...:
unnamed args: (1, 2)
keyword args: {'key': 'word', 'key2': 'word2'}
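The same * and ** syntax also works in the other direction, when calling a function with arguments stored in a tuple and a dictionary. A minimal sketch, where describe and add are hypothetical helper functions:

```python
def describe(*args, **kwargs):
    # args is a tuple of positional arguments, kwargs a dict of named ones.
    return len(args), sorted(kwargs)

def add(a, b, scale=1):
    return (a + b) * scale

positional = (2, 3)
named = {"scale": 10}

# Equivalent to add(2, 3, scale=10).
unpacked = add(*positional, **named)

counts = describe(1, 2, key="word", key2="word2")
print(unpacked, counts)
```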
Useful methods and modules
• The Python Tutorial
– Input and Output
• The Python Standard Library Reference
– Common string methods
– Regular expression operations
– Numeric and Mathematical Modules
– CSV File Reading and Writing
Files - input
inflobj = open('data', 'r')   # open the file 'data' for input
S = inflobj.read()            # read whole file into one string
S = inflobj.read(N)           # read at most N characters (bytes in binary mode), N >= 1
L = inflobj.readline()        # read one line
L = inflobj.readlines()       # return a list of line strings

https://docs.python.org/3/tutorial/inputoutput.html
Files - output

outflobj = open('data', 'w')  # open the file 'data' for writing
outflobj.write(S)             # write the string S to file
outflobj.writelines(L)        # write each of the strings in list L to file
outflobj.close()              # close the file

https://docs.python.org/3/tutorial/inputoutput.html
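In practice, files are often opened in a with block, which closes the file automatically even if an error occurs. The sketch below writes to a temporary directory so that it does not touch existing files. The file name data.txt and the variable names are made up.

```python
import os
import tempfile

lines = ["first line\n", "second line\n"]
path = os.path.join(tempfile.mkdtemp(), "data.txt")

# Write: the file is closed when the with block ends.
with open(path, "w") as outfile:
    outfile.writelines(lines)

# Read everything back as a list of line strings.
with open(path, "r") as infile:
    read_back = infile.readlines()

# A file object can also be iterated line by line.
with open(path) as infile:
    stripped = [line.rstrip("\n") for line in infile]

print(read_back, stripped)
```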
Module math
Command name          Description
abs(value)            absolute value (built-in; the math module provides fabs)
ceil(value)           rounds up
cos(value)            cosine, in radians
floor(value)          rounds down
log(value)            logarithm, base e
log10(value)          logarithm, base 10
max(value1, value2)   larger of two values (built-in)
min(value1, value2)   smaller of two values (built-in)
round(value)          nearest whole number (built-in)
sin(value)            sine, in radians
sqrt(value)           square root

Constant              Description
e                     2.7182818...
pi                    3.1415926...

# preferred.
import math
math.fabs(-0.5)

# This is fine
from math import fabs
fabs(-0.5)

# bad style. Many unknown names in name space.
from math import *
fabs(-0.5)
Module random
• Generating random numbers is important in
statistics
In [75]: import random
...: four_uniform_randoms = [random.random() for _ in range(4)]
...: four_uniform_randoms
...:
Out[75]:
[0.5687302894847388,
0.6562738117250464,
0.3396960191199996,
0.016968446644451407]
• Other useful functions: seed(), randint, randrange, shuffle, etc.
• Type in “random” and then use tab completion to see available
functions and use “?” to see docstring of function.
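A short sketch of the functions mentioned above. Seeding the generator makes a run reproducible, which helps when results need to be repeated. The seed value 42 is arbitrary.

```python
import random

# The same seed produces the same sequence of pseudo-random numbers.
random.seed(42)
first = [random.random() for _ in range(3)]
random.seed(42)
second = [random.random() for _ in range(3)]

die = random.randint(1, 6)      # integer from 1 to 6, both ends included
step = random.randrange(0, 10)  # integer from 0 to 9, like range(0, 10)

deck = list(range(10))
random.shuffle(deck)            # shuffles the list in place
print(first == second, die, step, deck)
```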
Important python modules for data
science
• NumPy
– Key module for scientific computing
– Convenient and efficient ways to handle
multidimensional arrays
• pandas
– DataFrame
– Flexible data structure of labeled tabular data
• Matplotlib: for plotting
• SciPy: solutions to common scientific computing
problems such as linear algebra, optimization,
statistics, sparse matrices
Module paths
• In order to find a module called myscripts (file myscripts.py), the
interpreter scans the list sys.path of directory names.
• The module must be in one of those directories.

>>> import sys
>>> sys.path
['C:\\Python26\\Lib\\idlelib', 'C:\\WINDOWS\\system32\\python26.zip',
'C:\\Python26\\DLLs', 'C:\\Python26\\lib', 'C:\\Python26\\lib\\plat-win',
'C:\\Python26\\lib\\lib-tk', 'C:\\Python26', 'C:\\Python26\\lib\\site-packages']
>>> import myscripts
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    import myscripts
ImportError: No module named myscripts
Appendix
Sequence types: Tuples,
Lists, and Strings
Sequence Types
1. Tuple: ('john', 32, [CMSC])
– A simple immutable ordered sequence of items
– Items can be of mixed types, including collection types
2. Strings: "John Smith"
– Immutable
– Conceptually very much like a tuple
3. List: [1, 2, 'john', ('up', 'down')]
– Mutable ordered sequence of items of mixed types
Similar Syntax
• All three sequence types (tuples, strings, and
lists) share much of the same syntax and
functionality.
• Key difference:
– Tuples and strings are immutable
– Lists are mutable
• The operations shown in this section can be
applied to all sequence types
– most examples will just show the operation
performed on one
Defining Sequence
• Define tuples using parentheses and commas
>>> tu = (23, 'abc', 4.56, (2,3), 'def')
• Define lists using square brackets and commas
>>> li = ["abc", 34, 4.34, 23]
• Define strings using quotes (", ', or """).
>>> st = "Hello World"
>>> st = 'Hello World'
>>> st = """This is a multi-line
string that uses triple quotes."""
Accessing one element
• Access individual members of a tuple, list, or string
using square bracket “array” notation
• Note that all are 0 based…
>>> tu = (23, 'abc', 4.56, (2,3), 'def')
>>> tu[1] # Second item in the tuple.
'abc'
>>> li = ["abc", 34, 4.34, 23]
>>> li[1] # Second item in the list.
34
>>> st = "Hello World"
>>> st[1] # 2nd character in string. Still str type
'e'
Positive and negative indices

>>> t = (23, 'abc', 4.56, (2,3), 'def')
Positive index: count from the left, starting with 0
>>> t[1]
'abc'
Negative index: count from the right, starting with -1
>>> t[-3]
4.56
Slicing: return copy of a subset

>>> t = (23, 'abc', 4.56, (2,3), 'def')
Return a copy of the container with a subset of the
original members. Start copying at the first index,
and stop copying before the second.
>>> t[1:4]
('abc', 4.56, (2,3))
Negative indices count from the end
>>> t[1:-1]
('abc', 4.56, (2,3))
Slicing: return copy of a subset

>>> t = (23, 'abc', 4.56, (2,3), 'def')
Omit the first index to make a copy starting from the
beginning of the container
>>> t[:2]
(23, 'abc')
Omit the second index to make a copy starting at the first
index and going to the end
>>> t[2:]
(4.56, (2,3), 'def')
Copying the Whole Sequence
• [ : ] makes a copy of an entire sequence
>>> t[:]
(23, 'abc', 4.56, (2,3), 'def')
• Note the difference between these two lines for mutable
sequences
>>> l2 = l1 # Both refer to 1 ref, changing one affects both
>>> l2 = l1[:] # Independent copies, two refs
The ‘in’ Operator
• Boolean test whether a value is inside a
container:
>>> t = [1, 2, 4, 5]
>>> 3 in t
False
>>> 4 in t
True
>>> 4 not in t
False
• For strings, tests for substrings
>>> a = 'abcde'
>>> 'c' in a
True
>>> 'cd' in a
True
>>> 'ac' in a
False
The + Operator
The + operator produces a new tuple, list, or string whose
value is the concatenation of its arguments.

>>> (1, 2, 3) + (4, 5, 6)
(1, 2, 3, 4, 5, 6)

>>> [1, 2, 3] + [4, 5, 6]
[1, 2, 3, 4, 5, 6]

>>> "Hello" + " " + "World"
'Hello World'
The * Operator

• The * operator produces a new tuple, list, or string
that “repeats” the original content.
>>> (1, 2, 3) * 3
(1, 2, 3, 1, 2, 3, 1, 2, 3)

>>> [1, 2, 3] * 3
[1, 2, 3, 1, 2, 3, 1, 2, 3]

>>> "Hello" * 3
'HelloHelloHello'
Mutability:
Tuples vs. Lists
Lists are mutable
>>> li = ['abc', 23, 4.34, 23]
>>> li[1] = 45
>>> li
['abc', 45, 4.34, 23]
• We can change lists in place.
• Name li still points to the same memory
reference when we’re done.
Tuples are immutable
>>> t = (23, 'abc', 4.56, (2,3), 'def')
>>> t[2] = 3.14
Traceback (most recent call last):
  File "<pyshell#75>", line 1, in -toplevel-
    t[2] = 3.14
TypeError: object doesn't support item assignment

• You can’t change a tuple.
• You can make a fresh tuple and assign its reference
to a previously used name.
>>> t = (23, 'abc', 3.14, (2,3), 'def')
• Because tuples are immutable, they are typically somewhat
cheaper to create and store than lists.
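Two practical consequences of immutability, sketched under the assumption of standard Python 3 behaviour: a tuple of immutable items can be used as a dictionary key, and a tuple only fixes which objects it holds, so a list stored inside a tuple can still change. The names point, distances and t are made up.

```python
# A tuple of immutable items is hashable, so it can be a dictionary key.
point = (3, 4)
distances = {point: 5.0}

# A list is mutable and therefore not hashable.
try:
    bad = {[3, 4]: 5.0}
    list_key_error = None
except TypeError:
    list_key_error = "TypeError"

# The tuple itself cannot be changed, but a list stored inside it can.
t = (23, 'abc', [1, 2])
t[2].append(3)               # allowed: mutates the inner list
try:
    t[2] = [0]               # not allowed: rebinding a tuple slot
    slot_error = None
except TypeError:
    slot_error = "TypeError"

print(distances[(3, 4)], list_key_error, t, slot_error)
```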
Operations on Lists Only
>>> li = [1, 11, 3, 4, 5]

>>> li.append('a') # Note the method syntax
>>> li
[1, 11, 3, 4, 5, 'a']

>>> li.insert(2, 'i')
>>> li
[1, 11, 'i', 3, 4, 5, 'a']
The extend method vs +
• + creates a fresh list with a new memory ref
• extend operates on list li in place.
>>> li.extend([9, 8, 7])
>>> li
[1, 11, 'i', 3, 4, 5, 'a', 9, 8, 7]
• Potentially confusing:
– extend takes a list (or any iterable) as an argument and adds each of its items.
– append takes a single object as an argument and adds it as one item.
>>> li.append([10, 11, 12])
>>> li
[1, 11, 'i', 3, 4, 5, 'a', 9, 8, 7, [10, 11, 12]]
Operations on Lists Only
Lists have many methods, including index, count, remove,
reverse, sort
>>> li = ['a', 'b', 'c', 'b']
>>> li.index('b') # index of 1st occurrence
1
>>> li.count('b') # number of occurrences
2
>>> li.remove('b') # remove 1st occurrence
>>> li
['a', 'c', 'b']
Operations on Lists Only
>>> li = [5, 2, 6, 8]

>>> li.reverse() # reverse the list in place

>>> li
[8, 6, 2, 5]

>>> li.sort() # sort the list in place

>>> li
[2, 5, 6, 8]

>>> li.sort(key=some_function)
# sort in place using a user-defined key function
# (Python 2 also accepted a comparison function: li.sort(some_function))
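A sketch of customized sorting in Python 3, where list.sort and sorted take a key function as a keyword argument. An old-style comparison function can be adapted with functools.cmp_to_key. The data and the helper compare_desc are hypothetical and serve only as an illustration.

```python
from functools import cmp_to_key

words = ['pear', 'fig', 'banana', 'kiwi']

by_length = sorted(words, key=len)     # sorted() returns a new list
print(by_length)                       # ['fig', 'pear', 'kiwi', 'banana']
print(words)                           # original order is untouched

result = words.sort()                  # sort() works in place and returns None
print(result)                          # None
print(words)                           # ['banana', 'fig', 'kiwi', 'pear']

def compare_desc(x, y):
    """Hypothetical old-style comparison: negative means x sorts first."""
    return y - x

nums = [5, 2, 6, 8]
nums.sort(key=cmp_to_key(compare_desc))
print(nums)                            # [8, 6, 5, 2]
```

Assigning the result of sort() to a name is a common slip: the list is sorted, but the name ends up holding None.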
Tuple details
• The comma is the tuple creation operator, not
parens
>>> 1,
(1,)

• Python shows parens for clarity (best practice)

>>> (1,)
(1,)

• Don't forget the comma!

>>> (1)
1

• Trailing comma only required for singletons; it is optional for others
• Empty tuples have a special syntactic form
>>> ()
()
>>> tuple()
()
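The tuple details above can be verified with type checks. This is a minimal sketch with illustrative names; the swap at the end is an added example of the comma building a tuple on the right-hand side of an assignment.

```python
single = 1,                    # the comma makes the tuple
also_single = (1,)             # parentheses are optional but clearer
not_a_tuple = (1)              # just the int 1 in parentheses
trailing = (1, 2,)             # trailing comma is allowed, not required
empty = ()

print(type(single).__name__)       # tuple
print(type(not_a_tuple).__name__)  # int
print(single == also_single)       # True
print(trailing == (1, 2))          # True
print(empty == tuple())            # True

x, y = 1, 2
x, y = y, x                    # right side packs a tuple, left side unpacks it
print(x, y)                    # 2 1
```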
Summary: Tuples vs. Lists
• Lists are slower but more powerful than tuples
– Lists can be modified, and they have lots of handy
operations and methods
– Tuples are immutable and have fewer features
• To convert between tuples and lists use the list() and
tuple() functions:
li = list(tu)
tu = tuple(li)
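A sketch of the conversion functions: each call builds a new object, so the original is not affected by later changes. The final part is an addition not made in the slides: because tuples are immutable they can serve as dictionary keys (when their contents are hashable), while lists cannot. The dictionary points is a hypothetical example.

```python
tu = (1, 2, 3)

li = list(tu)                  # new list with the same items
li.append(4)                   # modifying the list leaves tu alone
print(tu)                      # (1, 2, 3)

tu2 = tuple(li)                # freeze the modified list into a new tuple
print(tu2)                     # (1, 2, 3, 4)

# Tuples of hashable items can be dictionary keys; lists cannot.
points = {(0, 0): 'origin'}
try:
    points[[1, 1]] = 'corner'  # a list key raises TypeError
    list_key_ok = True
except TypeError:
    list_key_ok = False
    points[tuple([1, 1])] = 'corner'

print(list_key_ok)             # False
print(points[(1, 1)])          # corner
```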

