Introduction To Quantitative Data Analysis in Python (I)

Uploaded by

Peter Kremers
Copyright
© All Rights Reserved
Introduction to Quantitative Data Analysis in Python (I)

1 Python modules for quantitative data analysis


Python has different levels of modularization:
• Package: a package is a folder containing multiple Python modules.
• Module: A module is a code library which can contain function and variable definitions, as
well as valid Python statements.
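The difference between a module and a package can be checked from within Python. The sketch below uses two standard-library names chosen purely for illustration (`math`, a single module, and `json`, a package), and relies on the fact that packages carry a `__path__` attribute while plain modules do not.

```python
import math          # a single module
import json          # a package: a folder of modules
import json.decoder  # one of the modules inside the json package

# Packages expose a __path__ attribute; plain modules do not.
math_is_package = hasattr(math, "__path__")
json_is_package = hasattr(json, "__path__")

print("math is a package:", math_is_package)
print("json is a package:", json_is_package)

# A module bundles variables and functions under one namespace.
print("A variable from math:", math.pi)
print("A function from math:", math.sqrt(16))
```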
Common Python modules used for data analysis:
• NumPy: one of the most fundamental packages in Python, NumPy is a general-purpose
array-processing package. NumPy’s main object is the homogeneous multidimensional array,
which is a table of elements (usually numbers) of the same datatype, indexed by a tuple of
non-negative integers.
• SciPy: built upon NumPy and its arrays. It provides efficient mathematical routines such as
linear algebra, interpolation, optimization, integration, and statistics.
• Pandas: a foundational Python library for data science. It provides DataFrames, which
are labeled, two-dimensional data structures (i.e. representing tabular data with rows and
columns) similar to an SQL database table or an Excel spreadsheet.
• Matplotlib: the most common plotting library for Python, used to generate basic data
visualisations.
• Seaborn: an extension of Matplotlib with advanced features for plotting attractive and
informative visualisations with simpler, more concise syntax.
• Statsmodels: used for statistical modeling and advanced analysis.

2 Python Variables
Variables are containers for storing data values. Rules for Python variable names:
• A variable name must start with a letter or an underscore (conventionally a lowercase letter)
• A variable name cannot start with a digit
• A variable name can only contain alphanumeric characters and underscores (A-Z, a-z, 0-9, and
_)
• Variable names are case-sensitive
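The naming rules above can be tested programmatically. The following is a minimal sketch: the helper `is_valid_name` and the candidate names are hypothetical, introduced only for illustration. It combines `str.isidentifier()` with the standard `keyword` module, because reserved words such as `class` are syntactically valid identifiers but still cannot be used as variable names. Note that Python 3 also accepts many non-ASCII letters in names.

```python
import keyword

def is_valid_name(name):
    """Return True if `name` can be used as a variable name."""
    return name.isidentifier() and not keyword.iskeyword(name)

# Hypothetical candidate names, some valid and some not
for candidate in ["my_var", "_hidden", "myVar2", "2fast", "my-var", "class"]:
    print(candidate, "->", is_valid_name(candidate))

# Names are case-sensitive: these are two different variables.
age = 30
Age = 45
print(age, Age)
```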

3 Python Data Types
Python has the following data types built-in by default, in these categories:
• Numeric Types: int, float, complex
• Text Type: str
• Sequence Types: list, tuple, range
• Mapping Type: dict
• Set Types: set, frozenset
• Boolean Type: bool
• Binary Types: bytes, bytearray, memoryview
[1]: # Python numbers in three types
i1 = 1 # int
f1 = 2.8 # float
f2 = 3.5e3 # float; "e" indicates a power of 10, so 3.5e3 = 3.5 * 10**3
z = 3 + 5j # complex

# Use the type() function to verify the type of any object in Python
print(type(i1))
print(f2)
print(type(z))

<class 'int'>
3500.0
<class 'complex'>
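The numeric types also differ in how they behave under arithmetic. The short sketch below (variable names are illustrative only) shows the two division operators, the parts of a complex number, and what happens when an int and a float are mixed.

```python
a = 7
b = 2

true_division = a / b     # always returns a float
floor_division = a // b   # rounds down; an int when both operands are ints
remainder = a % b
power = a ** b

print(true_division, floor_division, remainder, power)

z = 3 + 5j
print(z.real, z.imag)     # both parts are stored as floats

mixed = a + 2.8           # mixing int and float gives a float
print(type(mixed))
```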

[2]: # String literals in python are surrounded by either single quotation marks,
# or double quotation marks.

my_str = "Hello, World!"


print(my_str)

# Use three quotes (either """ or ''') for multiline strings


my_str2 = """Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua."""
print(my_str2, "\n") # \n means starting a new line

# Get the character at a position


# Note that the first character has the position 0
my_str = "Hello, World!"
print("Getting characters in a string: ", my_str[0], my_str[1])

# Slice a string by specifying the start index and the end index,
# separated by a colon, to return a part of the string
print("Slicing a string: ", my_str[2:5]) # Get characters from position 2 to 5 (not included)

# The len() function returns the length of a string:


print("Length of a string: ", len(my_str))

# The strip() method removes any whitespace from the beginning or the end
my_str2 = " Hello, World! "
print("Stripping a string: ", my_str2.strip())

# The upper() and lower() methods return the string in upper and lower case,
# respectively

print("Changing a string to upper case: ", my_str.upper())


print("Changing a string to lower case: ", my_str.lower())

# The split() method splits the string into substrings


# if it finds instances of the separator:
print("Splitting a string: ", my_str.split(","))

# The replace() method replaces a string with another string:


print("Replacing a string: ", my_str.replace("H", "J"))

# Use the keywords 'in' or 'not in' to check


# if a certain phrase or character is present in a string
s = "Hello" not in my_str
print("Checking sub-string: ", s)

# Use the '+' operator to concatenate or combine two strings


print("Concatenation of strings: ", "Hello," + " World" + "!")

# To insert characters that are illegal in a string, use an escape character:
# a backslash \ followed by the character to be inserted into the string
print("Escaping character in a string: ", "\t Hello, \"World\"! \n")

# The format() method takes the passed arguments, formats them, and places them
# in the string where the placeholders {} are.

quantity = 3
itemno = 567
price = 49.95
print("Formatting a string: ",
      "I want {} pieces of item {} for {} dollars.".format(quantity, itemno, price))

Hello, World!
Lorem ipsum dolor sit amet,
consectetur adipiscing elit,

sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua.

Getting characters in a string: H e


Slicing a string: llo
Length of a string: 13
Stripping a string: Hello, World!
Changing a string to upper case: HELLO, WORLD!
Changing a string to lower case: hello, world!
Splitting a string: ['Hello', ' World!']
Replacing a string: Jello, World!
Checking sub-string: False
Concatenation of strings: Hello, World!
Escaping character in a string: Hello, "World"!

Formatting a string: I want 3 pieces of item 567 for 49.95 dollars.
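As an alternative to format(), Python 3.6 and later also support formatted string literals (f-strings), which place the expressions directly inside the placeholders. The sketch below reuses the values from the cell above; the `total` line is an added illustration of a format specification.

```python
quantity = 3
itemno = 567
price = 49.95

# Prefix the string with f and put the variable names inside the braces
message = f"I want {quantity} pieces of item {itemno} for {price} dollars."
print(message)

# A format specification after the colon controls the presentation,
# here two digits after the decimal point
total = f"Total: {quantity * price:.2f} dollars"
print(total)
```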

[3]: # A list is a collection which is ordered and changeable. It allows duplicate
# members. Lists are written with square brackets.

my_list = ["apple", "banana", "cherry", "orange", "kiwi", "mango"]


print(my_list)

# Access list items by referring to the index number (0 refers to the first item)
# Negative indexing means beginning from the end (-1 refers to the last item)
print("Access an item in a list: ", my_list[0], my_list[-1])

# Access a sub list by specifying a range of indexes [start:end].


# Note 'start' item included, but 'end' item excluded
print("Access sub-list: {}\n \t{}\n \t{}\n"
      .format(my_list[2:4], my_list[:4], my_list[4:]))

# Change an item in a list


my_list[1] = "plum"
print("Change an item in a list: ", my_list)

# Use append() method to append an item to the end of list


# Use insert() method to add an item at a specified index
my_list.append("pear")
print("Append an item to the list: ", my_list)
my_list.insert(1, "grape")
print("Add an item to the list:", my_list)

# Use remove() to remove a specified item from the list
# Use pop() to remove the item at a specified index (or the last item if no
# index is specified)

my_list.remove("grape")
print("Remove an item from a list:", my_list)
my_list.pop(1)
print("Remove an item from a list", my_list)

# Use copy() to create a copy of the list
# or use the built-in list() constructor
my_list1 = my_list.copy()
print("Create a copy of a list:", my_list1)
my_list2 = list(my_list)
print("Create another copy of the list:", my_list2)

# Use the '+' operator to join (concatenate) two or more lists


my_list2 = ["peach", "apricot"]
print("Join two lists:", my_list + my_list2)

# Use clear() to empty a list


my_list.clear()
print("Empty a list: ", my_list)

# Delete a list completely


del my_list

['apple', 'banana', 'cherry', 'orange', 'kiwi', 'mango']


Access an item in a list: apple mango
Access sub-list: ['cherry', 'orange']
['apple', 'banana', 'cherry', 'orange']
['kiwi', 'mango']

Change an item in a list: ['apple', 'plum', 'cherry', 'orange', 'kiwi',
'mango']
Append an item to the list: ['apple', 'plum', 'cherry', 'orange', 'kiwi',
'mango', 'pear']
Add an item to the list: ['apple', 'grape', 'plum', 'cherry', 'orange', 'kiwi',
'mango', 'pear']
Remove an item from a list: ['apple', 'plum', 'cherry', 'orange', 'kiwi',
'mango', 'pear']
Remove an item from a list ['apple', 'cherry', 'orange', 'kiwi', 'mango',
'pear']
Create a copy of a list: ['apple', 'cherry', 'orange', 'kiwi', 'mango', 'pear']
Create another copy of the list: ['apple', 'cherry', 'orange', 'kiwi', 'mango',
'pear']
Join two lists: ['apple', 'cherry', 'orange', 'kiwi', 'mango', 'pear', 'peach',
'apricot']
Empty a list: []
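The reason copy() and list() matter is that plain assignment does not copy a list; it only gives the same list a second name. The following minimal sketch (list contents and variable names are illustrative) contrasts the two cases.

```python
fruits = ["apple", "cherry", "orange"]

alias = fruits            # no copy: both names refer to the same list
shallow = fruits.copy()   # a new, independent list

alias.append("kiwi")      # also visible through `fruits`
shallow.append("mango")   # does not affect `fruits`

print("fruits: ", fruits)
print("alias:  ", alias)
print("shallow:", shallow)
print("alias is fruits:  ", alias is fruits)
print("shallow is fruits:", shallow is fruits)
```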

[4]: # A tuple is a collection which is ordered and unchangeable, or immutable
# Once a tuple is created, you cannot change/add/remove an item
# In Python tuples are written with round brackets

# Create a tuple
my_tuple = ("apple", "banana", "cherry", "durian", "fig", "pear")
print(my_tuple)

# Access an item in a tuple by indexing


# Negative indexing means beginning from the end (-1 refers to the last item)
print("Access an item in a tuple:", my_tuple[1], my_tuple[-2])

# Access a range of items in a tuple by specifying a range of indexes [start:end].
# Note 'start' item included, but 'end' item excluded


print("Access a range of items in a tuple: {}\n \t{}\n \t{}\n"
.format(my_tuple[2:4], my_tuple[:3], my_tuple[3:]))

# A tuple cannot be changed directly, but you can convert the tuple into a list,
# then change the list, and convert the list back into a tuple.
my_tuple = ("apple", "banana", "cherry")
my_list = list(my_tuple)
my_list[1] = "kiwi"
my_tuple = tuple(my_list)

print("Change a tuple:", my_tuple)

('apple', 'banana', 'cherry', 'durian', 'fig', 'pear')


Access an item in a tuple: banana fig
Access a range of items in a tuple: ('cherry', 'durian')
('apple', 'banana', 'cherry')
('durian', 'fig', 'pear')

Change a tuple: ('apple', 'kiwi', 'cherry')
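To see what "unchangeable" means in practice, the sketch below attempts an item assignment on a tuple and catches the resulting TypeError. It also shows tuple unpacking and the trailing comma needed for a one-item tuple. Variable names are illustrative only.

```python
fruits = ("apple", "banana", "cherry")

# Item assignment on a tuple raises a TypeError
assignment_failed = False
try:
    fruits[1] = "kiwi"
except TypeError as err:
    assignment_failed = True
    print("Cannot change a tuple:", err)

# Unpacking assigns each item of the tuple to its own variable
first, second, third = fruits
print(first, second, third)

# A tuple with a single item needs a trailing comma
single = ("apple",)
not_a_tuple = ("apple")
print(type(single), type(not_a_tuple))
```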

[5]: # A set is a collection which is unordered and unindexed.


# In Python sets are written with curly brackets.

my_set = {"apple", "banana", "cherry"}


print(my_set)

# Items in a set cannot be accessed by referring to an index:
# since sets are unordered, the items have no index.
# Instead, loop through the set items using a 'for' loop
for x in my_set:
    print("Each item in the set:", x)

# Check if a specified value is present in a set, by using the 'in' keyword
print("Check if an item in a set:", "banana" in my_set)

# Once a set is created, its items cannot be changed, but new items can be added.
# Use the add() method to add one item and update() to add multiple items


my_set.add("orange")
print("Add an item to the set:", my_set)
my_set.update(["orange", "mango", "grapes"])
print("Add multiple items to the set:", my_set)

# Use the remove() or discard() method to remove an item in a set


my_set.remove("banana")
print("Remove an item from a set:", my_set)
# Use discard() method to remove an item.
# If the item to remove does not exist, discard() will NOT raise an error.
my_set.discard("papaya")
print("Remove an item from a set:", my_set)
# Remove an arbitrary item by using the pop() method (sets are unordered, so there is no "last" item)
my_set.pop()
print("Remove the last item from a set:", my_set)
print("Length of the set:", len(my_set))

# Use the union() method to return a new set containing all items from both sets

my_set1 = {"pear", "fig"}


my_set2 = my_set.union(my_set1)
print("Join two sets:", my_set2)

# Use clear() to empty a set


my_set.clear()
print("Empty a set: ", my_set)

# Delete a set completely


del my_set

{'cherry', 'banana', 'apple'}


Each item in the set: cherry
Each item in the set: banana
Each item in the set: apple
Check if an item in a set: True
Add an item to the set: {'cherry', 'orange', 'banana', 'apple'}
Add multiple items to the set: {'orange', 'cherry', 'grapes', 'banana', 'apple',
'mango'}
Remove an item from a set: {'orange', 'cherry', 'grapes', 'apple', 'mango'}
Remove an item from a set: {'orange', 'cherry', 'grapes', 'apple', 'mango'}

Remove the last item from a set: {'cherry', 'grapes', 'apple', 'mango'}
Length of the set: 4
Join two sets: {'grapes', 'pear', 'apple', 'cherry', 'fig', 'mango'}
Empty a set: set()
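Besides union(), sets support the other standard set operations, and the difference between remove() and discard() only shows up when the item is missing. The sketch below uses two illustrative sets; results are passed through sorted() because a set has no fixed printing order.

```python
a = {"apple", "banana", "cherry"}
b = {"cherry", "fig", "pear"}

common = a.intersection(b)                   # items in both sets
only_in_a = a.difference(b)                  # items in a but not in b
either_not_both = a.symmetric_difference(b)  # items in exactly one set

print("Intersection:", sorted(common))
print("Difference:", sorted(only_in_a))
print("Symmetric difference:", sorted(either_not_both))

# remove() raises a KeyError for a missing item, discard() does not
remove_failed = False
try:
    a.remove("papaya")
except KeyError:
    remove_failed = True
a.discard("papaya")
print("remove() raised KeyError:", remove_failed)
```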

[6]: # A dictionary is a collection of key-value pairs which is changeable and indexed
# by key. Since Python 3.7 it preserves insertion order. Duplicate keys are not allowed.
# In Python dictionaries are written with curly brackets, and they have keys and values.

my_dict = {
    "brand": "Ford",
    "model": "Mustang",
    "year": 1964
}
print(my_dict)

# Access the items of a dictionary by referring to its key name inside square
# brackets, or use the get() method


print("Access the value of a key:", my_dict["model"])
print("Access the value of a key:", my_dict.get("model"))

# Change the value of a specific item by referring to its key name


my_dict["year"] = 2018
print(my_dict)

# Loop through a dictionary by using a for loop
# Print all key names in the dictionary, one by one
for x in my_dict:
    print("Each key name in the dictionary:", x)

# Use the values() method to return values of a dictionary
for x in my_dict.values():
    print("The value of each key in the dictionary:", x)

# Loop through both keys and values, by using the items() method
for x, y in my_dict.items():
    print("Each item in the dictionary:", x, y)

# To check if a specified key is present in a dictionary, use the 'in' keyword
if "model" in my_dict:
    print("Yes, 'model' is one of the keys in the dictionary")

# Add an item to the dictionary by using a new key and assigning a value to it
my_dict["color"] = "red"
print("Add an item to the dictionary:", my_dict)

# The pop() method removes the item with the specified key name
my_dict.pop("model")
print("Remove an item from the dictionary:", my_dict)

# The popitem() method removes the last inserted item
# (in versions before Python 3.7, an arbitrary item is removed instead)
my_dict.popitem()
print("Remove an item from the dictionary:", my_dict)

# Make a copy of a dictionary with the copy() method:


my_dict1 = my_dict.copy()
print("Create a copy of the dictionary:", my_dict1)

# The clear() method empties the dictionary


my_dict.clear()
print("Clear a dictionary:", my_dict)

# Delete a dictionary completely


del my_dict

{'brand': 'Ford', 'model': 'Mustang', 'year': 1964}


Access the value of a key: Mustang
Access the value of a key: Mustang
{'brand': 'Ford', 'model': 'Mustang', 'year': 2018}
Each key name in the dictionary: brand
Each key name in the dictionary: model
Each key name in the dictionary: year
The value of each key in the dictionary: Ford
The value of each key in the dictionary: Mustang
The value of each key in the dictionary: 2018
Each item in the dictionary: brand Ford
Each item in the dictionary: model Mustang
Each item in the dictionary: year 2018
Yes, 'model' is one of the keys in the dictionary
Add an item to the dictionary: {'brand': 'Ford', 'model': 'Mustang', 'year':
2018, 'color': 'red'}
Remove an item from the dictionary: {'brand': 'Ford', 'year': 2018, 'color':
'red'}
Remove an item from the dictionary: {'brand': 'Ford', 'year': 2018}
Create a copy of the dictionary: {'brand': 'Ford', 'year': 2018}
Clear a dictionary: {}
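The two ways of reading a value shown above behave differently when the key is missing: square brackets raise a KeyError, while get() returns None or a default of your choice. The following is a small sketch with an illustrative dictionary named `car`.

```python
car = {"brand": "Ford", "model": "Mustang", "year": 1964}

# get() returns None, or the given default, when the key is missing
color = car.get("color")
color_or_default = car.get("color", "unknown")
print(color, color_or_default)

# Square brackets raise a KeyError for a missing key
missing_key = None
try:
    car["color"]
except KeyError as err:
    missing_key = err.args[0]
print("Missing key:", missing_key)

# pop() also accepts a default, so removing a missing key does not fail
removed = car.pop("color", None)
print("Removed:", removed)
```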

[7]: # Use the following constructor functions to specify the data type


x = str("Hello World") # str
x = int(20) # int
x = float(20.5) # float

x = complex(1j) # complex
x = list(("apple", "banana", "cherry")) # list
x = tuple(("apple", "banana", "cherry")) # tuple
x = range(6) # range
x = dict(name="John", age=36) # dict
x = set(("apple", "banana", "cherry")) # set
x = frozenset(("apple", "banana", "cherry")) # frozenset
x = bool(5) # bool
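The same constructor functions are commonly used to convert a value from one type to another, for example when numbers arrive as text. The sketch below uses illustrative values and shows a few conversions, including one that fails.

```python
count = int("42")          # str -> int
ratio = float("3.5")       # str -> float
label = str(20)            # int -> str
truncated = int(2.8)       # float -> int drops the fractional part
letters = list("abc")      # str -> list of characters
unique = set([1, 2, 2, 3]) # list -> set removes duplicates

print(count, ratio, label, truncated, letters, unique)

# Zero and empty containers convert to False, everything else here to True
flags = [bool(0), bool(""), bool([]), bool(5), bool("a")]
print(flags)

# A conversion that is not possible raises a ValueError
conversion_failed = False
try:
    int("3.5")
except ValueError:
    conversion_failed = True
print("int('3.5') failed:", conversion_failed)
```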

4 Function
A function is defined with the keyword def, and can be defined anywhere in a Python program,
provided the definition runs before the function is called. It returns the object given in its
return statement, typically at the end of the function; a function without a return statement
returns None.
A lambda function is a small anonymous function. A lambda function can take any number of
arguments, but can only have one expression. The power of lambda is better shown when it is
used as an anonymous function inside another function.
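As a brief illustration of both points, the sketch below defines one function with a return statement and one without (which therefore returns None), and then passes a lambda as the `key` argument of the built-in sorted(). The function names and the ages in `people` are hypothetical values chosen for the example.

```python
def area(width, height):
    return width * height      # the object after `return` is the result

def greet(name):
    print("Hello,", name)      # no return statement, so the result is None

result = area(3, 4)
nothing = greet("Emil")
print(result, nothing)

# A lambda used as an anonymous function inside another function call:
# sort (name, age) pairs by age, the second element of each pair
people = [("Emil", 9), ("Tobias", 12), ("Linus", 5)]
by_age = sorted(people, key=lambda person: person[1])
youngest = by_age[0][0]
print("Sorted by age:", by_age)
print("The youngest child is", youngest)
```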
[8]: def my_function(fname):
    print(fname + " Refsnes")

my_function("Emil")
my_function("Tobias")
my_function("Linus")

# If the number of arguments passed into a function is unknown,
# add a * before the parameter name in the function definition
def my_function(*kids):
    print("The youngest child is " + kids[2])

my_function("Emil", "Tobias", "Linus")

# Arguments can be sent to a function with the key = value syntax.
# This way the order of the arguments does not matter
def my_function(child3, child2, child1):
    print("The youngest child is " + child3)

my_function(child1 = "Emil", child2 = "Tobias", child3 = "Linus")

# If the number of keyword arguments passed into your function is unknown,
# add two asterisks (**) before the parameter name in the function definition
def my_function(**kid):
    print("His last name is " + kid["lname"])

my_function(fname = "Tobias", lname = "Refsnes")

# A lambda function that multiplies argument a by argument b; the result is then printed:

x = lambda a, b : a * b
print("Calling a lambda function:", x(5, 6))

# A function definition that takes one argument n and returns a lambda
# that multiplies its own argument by n (a number not known until myfunc is called)
def myfunc(n):
    return lambda a : a * n

mydoubler = myfunc(2)
print("Calling a lambda function inside a function: ", mydoubler(11))

Emil Refsnes
Tobias Refsnes
Linus Refsnes
The youngest child is Linus
The youngest child is Linus
His last name is Refsnes
Calling a lambda function: 30
Calling a lambda function inside a function: 22
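The parameter styles shown in this cell can also be combined in a single definition, as long as they appear in the order: regular parameters, *args, keyword-only parameters, **kwargs. The following is a minimal sketch; the function `describe` and its parameter names are hypothetical.

```python
def describe(greeting, *names, separator=", ", **details):
    # `names` is a tuple of the extra positional arguments,
    # `details` is a dict of the extra keyword arguments
    line = greeting + " " + separator.join(names)
    for key, value in details.items():
        line += f" | {key}={value}"
    return line

summary = describe("Hello", "Emil", "Tobias", "Linus", lname="Refsnes")
print(summary)

# The default separator can be overridden by keyword
custom = describe("Hello", "Emil", "Tobias", separator=" and ")
print(custom)
```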
