DataScienceWithPython Ed2018
EUGENIA BAHIT
DATA SCIENCE
WITH PYTHON
STUDY MATERIAL
SUMMARY
VARIABLE MANIPULATION METHODS
STRING MANIPULATION
FORMATTING METHODS
CAPITALIZE THE FIRST LETTER
CONVERT A STRING TO LOWERCASE
CONVERT A STRING TO UPPERCASE
CONVERT UPPERCASE TO LOWERCASE AND VICE VERSA
CONVERT A STRING TO TITLE FORMAT
CENTER A TEXT
ALIGN TEXT TO THE LEFT
ALIGN TEXT TO THE RIGHT
FILL IN A TEXT BY PREFIXING IT WITH ZEROS
RESEARCH METHODS
COUNT NUMBER OF OCCURRENCES OF A SUBSTRING
SEARCH FOR A SUBSTRING WITHIN A STRING
VALIDATION METHODS
TO KNOW IF A STRING BEGINS WITH A GIVEN SUBSTRING
TO KNOW IF A STRING ENDS WITH A GIVEN SUBSTRING
TO KNOW IF A STRING IS ALPHANUMERIC
TO KNOW IF A STRING IS ALPHABETIC
TO KNOW IF A STRING IS NUMERIC
TO KNOW IF A STRING CONTAINS ONLY LOWERCASE LETTERS
TO KNOW IF A STRING CONTAINS ONLY UPPERCASE LETTERS
TO KNOW IF A STRING CONTAINS ONLY BLANKS
TO KNOW IF A STRING HAS A TITLE FORMAT
SUBSTITUTION METHODS
FORMATTING A STRING, DYNAMICALLY SUBSTITUTING TEXT
REPLACE TEXT IN A STRING
REMOVE CHARACTERS TO THE LEFT AND RIGHT OF A STRING
REMOVE CHARACTERS TO THE LEFT OF A STRING
REMOVE CHARACTERS TO THE RIGHT OF A STRING
JOINING AND SPLITTING METHODS
ITERATIVELY JOIN A CHAIN
each object. Methods are functions but they are derived from a variable. Therefore, these functions are accessed using the syntax:
variable.function()
In some cases, these methods (functions of an object) will accept parameters like any other function.
variable.function(parameter)
STRING MANIPULATION
The main methods that can be applied to a text string, organized by category, are described below.
FORMATTING METHODS
hello world
CENTER A TEXT
Method: center(length[, "fill character"])
Returns: a copy of the centered string
>>> string = "welcome to my application".capitalize()
>>> string.center(50, "=")
============Welcome to my application=============
RESEARCH METHODS
VALIDATION METHODS
>>> string = "pepegrillo"
>>> string.isalpha()
True
>>> string = "pepegrillo75"
>>> string.isalpha()
False
>>> string = "Pepegrillo"
>>> string.isupper()
False
>>> string = "PEPEGRILLO"
>>> string.isupper()
True
SUBSTITUTION METHODS
>>> string = "Gross amount: ${0} + VAT: ${1} = Net amount: {2}"
>>> string.format(100, 21, 121)
Gross amount: $100 + VAT: $21 = Net amount: 121
>>> string = "Gross amount: ${gross} + VAT: ${vat} = Net amount: {net}"
>>> string.format(gross=100, vat=21, net=121)
Gross amount: $100 + VAT: $21 = Net amount: 121
AGGREGATION METHODS
ELIMINATION METHODS
>>> male_names
['Ricky', 'Alvaro', 'David', 'Jacinto', 'Jose', 'Ricky', 'Jose', 'Jose', 'Jose']
ORDER METHODS
RESEARCH METHODS
>>> male_names.index("Miguel", 2, 5)
4
TYPE CONVERSION
In the set of Python built-in functions, it is possible to find two functions that allow you to convert lists into
tuples, and vice versa. These functions are list and tuple, to convert tuples to lists and lists to tuples, respectively.
One of the most frequent uses is the conversion of tuples to lists, which need to be modified. This is often the
>>> list(tuple)
[1, 2, 3, 4]
>>> tuple(list)
(1, 2, 3, 4)
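As a minimal sketch of the round trip described above, a tuple can be converted to a list, modified, and converted back. The names `coordinates` and `editable` are illustrative and do not come from the original text; they also avoid shadowing the built-in names list and tuple.

```python
# Tuples are immutable, so convert to a list before modifying.
coordinates = (1, 2, 3, 4)        # hypothetical example data
editable = list(coordinates)      # tuple -> list
editable.append(5)                # lists support in-place changes
coordinates = tuple(editable)     # list -> tuple
print(coordinates)
```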
CONCATENATION OF COLLECTIONS
You can concatenate (or join) two or more lists or two or more tuples, by means of the addition sign +.
You cannot join a list to a tuple. The collections to be joined must be of the same type.
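The rule above can be sketched as follows. All variable names are made up for the illustration; the last line shows one possible workaround (converting one side first), which is an addition and not part of the original text.

```python
list_a = [1, 2]
list_b = [3, 4]
tuple_a = (1, 2)
tuple_b = (3, 4)

joined_list = list_a + list_b      # list + list gives a new list
joined_tuple = tuple_a + tuple_b   # tuple + tuple gives a new tuple

# Mixing both types raises TypeError.
try:
    list_a + tuple_a
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Converting one side first makes the types match.
converted = list_a + list(tuple_b)
```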
RETURN METHODS
GET THE VALUE OF A KEY
TO KNOW IF A KEY EXISTS IN THE DICTIONARY
OBTAIN THE KEYS AND VALUES OF A DICTIONARY
OBTAIN THE KEYS TO A DICTIONARY
GET THE VALUES OF A DICTIONARY
OBTAIN THE NUMBER OF ITEMS IN A DICTIONARY
FILE HANDLING AND MANIPULATION
WAYS TO OPEN A FILE
SOME METHODS OF THE FILE OBJECT
CSV FILE HANDLING
SOME EXAMPLES OF CSV FILES
WORKING WITH CSV FILES FROM PYTHON
READING CSV FILES
WRITING CSV FILES
PROBABILITY AND STATISTICS WITH PYTHON
PROBABILITY OF MUTUALLY EXCLUSIVE SIMPLE AND COMPOUND EVENTS IN PYTHON
SAMPLE SPACE
SIMPLE AND COMPOUND EVENTS
PROBABILITY ASSIGNMENT
SIMPLE MUTUALLY EXCLUSIVE EVENTS
EVENTS COMPOSED OF MUTUALLY EXCLUSIVE SIMPLE EVENTS
FUNCTIONS
CONDITIONAL PROBABILITY IN PYTHON
FUNCTIONS
DEPENDENT EVENTS
SET THEORY IN PYTHON
INDEPENDENT EVENTS
BAYES THEOREM IN PYTHON
BAYES' THEOREM AND PROBABILITY OF CAUSES
DATA: CASE STUDY
ANALYSIS
PROCEDURE
FUNCTIONS
COMPLEMENTARY BIBLIOGRAPHY
ANNEX I: COMPLEX CALCULATIONS
POPULATION AND SAMPLING STATISTICS: CALCULATION OF
COUNT ITEMS
The len() function is used to count elements in a list or tuple, as well as characters in a text string:
EMPTY A DICTIONARY
Method: clear()
>>> dictionary = {"color": "violet", "size": "XS", "price": 174.25}
>>> dictionary
{'color': 'violet', 'price': 174.25, 'size': 'XS'}
COPY A DICTIONARY
Method: copy()
>>> dictionary = {"color": "violet", "size": "XS", "price": 174.25}
>>> tshirt = dictionary.copy()
>>> dictionary
{'color': 'violet', 'price': 174.25, 'size': 'XS'}
>>> tshirt.clear()
>>> tshirt
{}
>>> dictionary
{'color': 'violet', 'price': 174.25, 'size': 'XS'}
CONCATENATE DICTIONARIES
Method: update(dictionary)
>>> dictionary1 = {"color": "green", "price": 45}
>>> dictionary2 = {"size": "M", "brand": "Lacoste"}
>>> dictionary1.update(dictionary2)
>>> dictionary1
{'color': 'green', 'price': 45, 'brand': 'Lacoste', 'size': 'M'}
If the key does not exist, it creates it with the default value. Always returns the value for the key passed as
parameter.
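The behavior described above corresponds to the dictionary method setdefault(key[, default]). A minimal sketch, using a made-up `tshirt` dictionary (the data is illustrative, not taken from the original example):

```python
tshirt = {"color": "pink", "brand": "Zara"}   # hypothetical data

# Key missing: it is created with the default, and the default is returned.
size = tshirt.setdefault("size", "U")

# Key present: the stored value is returned and nothing changes.
color = tshirt.setdefault("color", "blue")

# Without a default, the missing key is created with None.
printed = tshirt.setdefault("print")
```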
>>> tshirt2
{'color': 'pink', 'print': None, 'brand': 'Zara', 'size': 'U'}
RETURN METHODS
>>> tshirt.get("stock")
>>> tshirt.get("stock", "out of stock")
'out of stock'
One of them is through the os module, which facilitates the work with the entire file and directory
The second level is the one that allows working with files by manipulating their reading and writing from the
for? The answers can be several: to read, to write, or to read and write.
This pointer will position a cursor (or access point) at a specific location in memory (more simply put, it will
This cursor will move within the file as the file is read or written to.
When a file is opened in read mode, the cursor is positioned at byte 0 of the file (i.e. at the beginning of the file).
Once the file has been read, the cursor moves to the final byte of the file (equivalent to the total number of bytes
in the file). The same happens when it is opened in write mode: the cursor moves forward as data is written.
When you want to write to the end of a non-null file, the append mode is used. In this way, the file is opened
with the cursor at the end of the file.
The + symbol as a mode suffix adds the opposite mode to the opening mode.
For example, the r (read) mode with the suffix + (r+) opens the file for both reading and writing, with the
cursor initially at byte 0.
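The cursor positions discussed above can be observed with the file object's tell() method. This is only a sketch: the scratch file name and its contents are assumptions made for the illustration, and the file is created in a temporary directory.

```python
import os
import tempfile

# Create a small scratch file (temporary path, not from the source).
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w") as f:
    f.write("hello")

with open(path, "r") as f:
    start_read = f.tell()      # read mode starts at byte 0
    f.read()
    end_read = f.tell()        # after reading, the cursor is at the end

with open(path, "a") as f:
    f.write(" world")          # append mode writes at the end of the file

with open(path, "r+") as f:
    start_rplus = f.tell()     # r+ also starts at byte 0
    f.write("J")               # overwrites the first byte in place

with open(path, "r") as f:
    final = f.read()
```

Note that r+ does not truncate the file, so writing at byte 0 replaces existing content instead of inserting before it.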
Mode   Description                                      Cursor position
a+     Append (add content) and read.                   If the file exists, at the end of it.
       Creates the file if it does not exist.           If the file does not exist, at the beginning.
ab+    Append (add content) and read, in binary mode.   If the file exists, at the end of it.
       Creates the file if it does not exist.           If the file does not exist, at the beginning.
Method          Description
read([bytes])   Reads the entire contents of a file. If the byte length is passed, it reads only the contents up to the specified length.
readlines()     Reads all lines of a file.
ACCESSING FILES THROUGH THE WITH STRUCTURE
With the with structure and the open() function, you can open a file in any mode and work with it, without
having to close it or destroy the pointer, as this is taken care of by the with structure.
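A minimal sketch of both operations with the with structure follows. The temporary path and the variable names `read_back` and `lines` are assumptions made for this illustration.

```python
import os
import tempfile

content = """
This will be the content of the new file.
The file will have several lines.
"""

# Hypothetical path in a temporary directory, chosen for this sketch.
path = os.path.join(tempfile.mkdtemp(), "file.txt")

# Write to a file: it is closed automatically when the block ends.
with open(path, "w") as file:
    file.write(content)

# Read a file: read() returns the whole content as one string.
with open(path, "r") as file:
    read_back = file.read()

# readlines() returns a list with one string per line.
with open(path, "r") as file:
    lines = file.readlines()
```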
Read a file:
Write to a file:
content = """
This will be the content of the new file.
The file will have several lines.
"""
text files, intended for massive data storage. It is one of the simplest formats for data analysis. In fact, many non-
free (or free but more complex) file formats are often converted to CSV format to apply complex data science
A CSV file consists of a header that defines column names, and the following rows have the data corresponding
to each column, separated by a comma. However, many other symbols can be used as cell separators. Among
them, the tab and the semicolon are just as frequent as the comma.
ID;DATA;VV;DV;T;HR;PPT;RS;P
0;2016-03-01 00:00:00;;;9.9;73;;;
1;2016-03-01 00:30:00;;;9.0;67;;;
2;2016-03-01 01:00:00;;;8.3;64;;;
3;2016-03-01 01:30:00;;;8.0;61;;;
4;2016-03-01 02:00:00;;;7.4;62;;;
5;2016-03-01 02:30:00;;;8.3;47;;;
6;2016-03-01 03:00:00;;;7.7;50;;;
7;2016-03-01 03:30:00;;;9.0;39;;;
Maria,858,1930
Jose,665,1930
Rosa,591,1930
Juan Carlos,522,1930
Antonio,509,1930
Maria Esther,495,1930
Maria Luisa,470,1930
Joan,453,1930
John,436,1930
Companies registered with the General Inspectorate of Justice of Argentina (separated by , and data in
quotation marks)
It is also possible to find data stored in text files (TXT) with formats very similar to what you would expect to
find in a CSV. Sometimes it is possible to develop a formatting script to correct these files to work with a CSV.
and writing.
This module is used in combination with the with structure and the open function to read or generate the file, and
0;2016-03-01 00:00:00;;;9.9;73;;;
1;2016-03-01 00:30:00;;;9.0;67;;;
2;2016-03-01 01:00:00;;;8.3;64;;;
3;2016-03-01 01:30:00;;;8.0;61;;;
4;2016-03-01 02:00:00;;;7.4;62;;;
5;2016-03-01 02:30:00;;;8.3;47;;;
6;2016-03-01 03:00:00;;;7.7;50;;;
7;2016-03-01 03:30:00;;;9.0;39;;;
8;2016-03-01 04:00:00;;;8.7;39;;;
from csv import reader

with open("file.csv", "r") as file:
    document = reader(file, delimiter=';', quotechar='"')
    for row in document:
        ' '.join(row)  # echoed in the interactive interpreter; use print() in a script
Output:
When the CSV file has a header, it is necessary to skip the header:
ID;DATA;VV;DV;T;HR;PPT;RS;P
0;2016-03-01 00:00:00;;;9.9;73;;;
1;2016-03-01 00:30:00;;;9.0;67;;;
2;2016-03-01 01:00:00;;;8.3;64;;;
3;2016-03-01 01:30:00;;;8.0;61;;;
4;2016-03-01 02:00:00;;;7.4;62;;;
5;2016-03-01 02:30:00;;;8.3;47;;;
6;2016-03-01 03:00:00;;;7.7;50;;;
7;2016-03-01 03:30:00;;;9.0;39;;;
8;2016-03-01 04:00:00;;;8.7;39;;;
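One common way to skip the header is to consume the first row with next() before iterating over the rest. The sketch below is an assumption about how this step could be written, not the original listing; it reads from an in-memory StringIO holding the first rows of the sample so that it runs without a file on disk.

```python
from csv import reader
from io import StringIO

# In-memory stand-in for file.csv (first rows of the sample above).
sample = StringIO(
    "ID;DATA;VV;DV;T;HR;PPT;RS;P\n"
    "0;2016-03-01 00:00:00;;;9.9;73;;;\n"
    "1;2016-03-01 00:30:00;;;9.0;67;;;\n"
)

document = reader(sample, delimiter=';', quotechar='"')
header = next(document)             # consume the header row
rows = [row for row in document]    # remaining rows are data only
```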
Output:
TYPE CONVERSION
CONCATENATION OF COLLECTIONS
MAXIMUM AND MINIMUM VALUE
COUNT ITEMS
DICTIONARY MANIPULATION
ELIMINATION METHODS
EMPTY A DICTIONARY
AGGREGATION AND CREATION METHODS
COPY A DICTIONARY
CREATE A NEW DICTIONARY FROM THE KEYS OF AN EXISTING DICTIONARY
SEQUENCE
CONCATENATE DICTIONARIES
SET A DEFAULT KEY AND VALUE
Another way to read CSV files with headers is to use the DictReader object instead of the reader,
and thus access only the value of the desired columns by name:
from csv import DictReader

with open("file.csv", "r") as file:
    document = DictReader(file, delimiter=';', quotechar='"')
    for row in document:
        row['DATA']  # echoed in the interactive interpreter; use print() in a script
Output:
'2016-03-01 00:00:00'
'2016-03-01 00:30:00'
'2016-03-01 01:00:00'
'2016-03-01 01:30:00'
'2016-03-01 02:00:00'
'2016-03-01 02:30:00'
'2016-03-01 03:00:00'
'2016-03-01 03:30:00'
'2016-03-01 04:00:00'
In the above example, an array could be a list of lists with equal number of elements. For example:
matrix = [
This would generate a file named data.csv with the following content:
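As an illustration of writing a list of lists with the csv module's writer object, here is a minimal sketch. The rows are made up, since the original matrix is not reproduced here, and the file is written to a temporary directory instead of the working directory.

```python
import os
import tempfile
from csv import writer

# Made-up rows for illustration; each inner list is one CSV row.
matrix = [
    ["Juan", 373, 1970],
    ["Ana", 124, 1983],
]

path = os.path.join(tempfile.mkdtemp(), "data.csv")

# newline="" lets the csv module control line endings itself.
with open(path, "w", newline="") as file:
    document = writer(file, delimiter=";", quotechar='"')
    document.writerows(matrix)

with open(path, "r", newline="") as file:
    written = file.read()
```

With these assumed rows, the file holds one line per inner list, with the values separated by semicolons.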
In this case, the matrix to be written will need to be a list of dictionaries whose keys match the
indicated headers.
matrix = [
    dict(player='Juan', points=373, year=1970),
    dict(player='Ana', points=124, year=1983),
    dict(player='Pedro', points=901, year=1650),
    dict(player='Rosa', points=300, year=2000),
    dict(player='Juana', points=75, year=1975),
]

from csv import DictWriter
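A minimal sketch of how such a list of dictionaries could be written with DictWriter and read back with DictReader. The `headers` list, the shortened matrix, and the temporary path are assumptions for this illustration.

```python
import os
import tempfile
from csv import DictReader, DictWriter

matrix = [
    dict(player='Juan', points=373, year=1970),
    dict(player='Ana', points=124, year=1983),
]
headers = ['player', 'points', 'year']   # must match the dictionary keys

path = os.path.join(tempfile.mkdtemp(), "data.csv")

with open(path, "w", newline="") as file:
    document = DictWriter(file, fieldnames=headers, delimiter=';')
    document.writeheader()        # first row: column names
    document.writerows(matrix)    # one row per dictionary

with open(path, "r", newline="") as file:
    rows = list(DictReader(file, delimiter=';'))
```

Values read back from a CSV file are always strings, so numeric columns need an explicit conversion such as int(row['points']).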
Simple statistical functions such as the following can be performed on lists and tuples obtained or
E = {1, 2, 3, 4, 5, 6}
sample_space = [1, 2, 3, 4, 5, 6]
Each element in a sample space is referred to as a sample point. The number of sample points is
n = len(sample_space)
• the probability that the number 5 comes out in this throw, is a simple event A = {5} and is
exclusive: if 5 comes out, no other number can simultaneously come out.
PROBABILITY ASSIGNMENT
Probability assignment is that which provides mathematical models to calculate the chances of
• simple or compound
• mutually exclusive or independent
$P(A_k) = \frac{1}{n}$
probability = 1.0 / n
In Python 2, at least one operand of the division must be a real number (float) if what is
required as a result is a real number; in Python 3, the / operator always returns a real number.
The probability of each sample point, as mutually exclusive events, is the same for each event.
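A short sketch of this assignment for a six-sided die. The use of fractions.Fraction for an exact result is an addition for illustration and does not appear in the original text.

```python
from fractions import Fraction

sample_space = [1, 2, 3, 4, 5, 6]
n = len(sample_space)

# Floating-point form, as in the text above.
probability = 1.0 / n

# Exact form: Fraction avoids rounding (an addition, not from the source).
exact = Fraction(1, n)

# The probabilities of all sample points add up to 1.
total = sum(Fraction(1, n) for _ in sample_space)
```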
of the composite event will be given by the sum of the probabilities of each simple event P(Ak) ,
such that:
$P(A) = P(A_1) + P(A_2) + \dots + P(A_k)$
For example, to estimate the probability that a single throw of a die will produce an even number,
In the first result, $\frac{3}{6}$ (in the second step, before finding the greatest common
divisor [GCD] and reducing the fraction to $\frac{1}{2}$), the numerator is equivalent to the number of simple
events within the compound event "even numbers" and is denoted by h. The denominator, 6, is n,
the total of all events in the sample space. Thus, the probability of a compound event A is given by
$P(A) = \frac{h}{n}$.
A compound event can be denoted by the union of its simple events (symbol ∪, read as "or"), such
that:
$P(A_1 \cup A_2 \cup \dots \cup A_k) = P(A_1) + P(A_2) + \dots + P(A_k)$
For example, for the case of the event "even numbers", it is obtained that:
$P(2 \cup 4 \cup 6) = P(2) + P(4) + P(6)$
$P(2 \cup 4 \cup 6) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{3}{6}$
$P(2 \cup 4 \cup 6) = \frac{1}{2}$
FUNCTIONS
# Probability of mutually exclusive simple events
pssme = lambda e: 1.0 / len(e)

# Probability of mutually exclusive compound events
def pscme(e, sc):
    n = len(e)
    return len(sc) / float(n)
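A short usage sketch for these two functions with the die example. The definitions are repeated so that the block runs on its own; the names `even`, `p_single` and `p_even` are illustrative.

```python
# Same definitions as above, repeated so the sketch is self-contained.
pssme = lambda e: 1.0 / len(e)

def pscme(e, sc):
    n = len(e)
    return len(sc) / float(n)

sample_space = [1, 2, 3, 4, 5, 6]
even = [i for i in sample_space if i % 2 == 0]   # compound event {2, 4, 6}

p_single = pssme(sample_space)       # probability of any single face
p_even = pscme(sample_space, even)   # h / n with h = 3 and n = 6
```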
d. Probability of intersection:
$P(A \cap B) = P(A)P(B)$
$P(A \cap B) = \frac{1}{2} \cdot \frac{1}{2}$
$P(A \cap B) = \frac{1}{4}$
# probability of A
a = [i for i in e if i % 2 != 0]
pa = len(a) / float(n)
# probability of B
b = [i for i in e if i % 2 == 0]
pb = len(b) / float(n)
FUNCTIONS
# Conditional probability: dependent events
def pscd(e, a, b):
    i = list(set(a).intersection(b))
    pi = pscme(e, i)
    pa = pscme(e, a)
    return pi / pa
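A usage sketch of the conditional probability function with a die, taking A as "odd number" and B as "number less than 4". The helper pscme is repeated so the block is self-contained; the variable `result` is illustrative.

```python
def pscme(e, sc):
    # Probability of a compound event: h / n
    return len(sc) / float(len(e))

def pscd(e, a, b):
    i = list(set(a).intersection(b))   # simple events common to A and B
    pi = pscme(e, i)                   # probability of the intersection
    pa = pscme(e, a)                   # probability of A
    return pi / pa                     # probability of B given A

e = [1, 2, 3, 4, 5, 6]
a = [1, 3, 5]      # odd number
b = [1, 2, 3]      # number less than 4
result = pscd(e, a, b)   # (2/6) / (3/6)
```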
DEPENDENT EVENTS
Refers to the probability of two events occurring simultaneously where the second event depends on
The probability of B occurring if A occurs is denoted by $P(B|A)$ and is read as "the probability of B
given A":
$P(B|A) = \frac{P(A \cap B)}{P(A)}$
where $P(A \cap B)$ is the probability of the intersection of the events A and B, that is, of the
simple events common to both. In the following example, the intersection is {1, 3} (because 1 and 3 are in
both A and B).
Example: what is the probability of rolling an odd number less than 4 with a die?
The throwing of the die is an event in itself. We wish to find the probability of B = {1, 2, 3} (number
less than 4) given that A = {1, 3, 5} (odd number) occurred in the sample space E = {1, 2, 3, 4, 5, 6}.
$A \cap B = \{1, 3\}$
intersec = [i for i in a if i in b]
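$P(A \cap B) = P(1) + P(3) = \frac{1}{6} + \frac{1}{6} = \frac{2}{6} = \frac{1}{3}$
$P(A) = \frac{h}{n} = \frac{3}{6} = \frac{1}{2}$
Finally, it is obtained that:
$P(B|A) = \frac{P(A \cap B)}{P(A)}$
$P(B|A) = \frac{1/3}{1/2}$
$P(B|A) = \frac{2}{3} \approx 0.67$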
# probability of intersection
probability_intersec = float(hintersec) / n
# probability of 'a'
probability_a = float(ha) / n
# conditional probability
probability_b_given_a = probability_intersec / probability_a
saying: return 'i' for each 'i' in list 'a' if it is in list 'b'.
However, since each compound event is a set and Python provides a data type called set, it is
possible to obtain the intersection by manipulating compound events as Python sets. With set you
can convert any iterable to a set and perform set operations such as union and intersection when
Here the set obtained is converted into a list in order to be consistent with the rest of the code and to
ensure that the resulting element supports the usual operations and processing of a list. When in
doubt as to whether to use lists or sets, the principle of simplicity should be applied and the simplest
INDEPENDENT EVENTS
Unlike the previous case, here the probability of occurrence of B is not affected by the occurrence of
and obtain an even number (event B) is not affected by the fact that an odd number was obtained in
a previous throw (event A). The probability of B is independent of A and is given by the product of
the probability of both events:
$P(A \cap B) = P(A)P(B)$
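A minimal sketch of this product for two throws of a die, where A is "odd number on one throw" and B is "even number on another throw". The variable names are assumptions made for the illustration.

```python
e = [1, 2, 3, 4, 5, 6]
n = len(e)

a = [i for i in e if i % 2 != 0]   # event A: odd number
b = [i for i in e if i % 2 == 0]   # event B: even number

pa = len(a) / float(n)             # probability of A
pb = len(b) / float(n)             # probability of B
p_both = pa * pb                   # product rule for independent events
```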
Once the probability of both independent events is calculated, they are multiplied obtaining:
$E = \{1, 2, 3, 4, 5, 6\}$
b. Probability of A:
$A = \{1, 3, 5\}$
$P(A) = \frac{h}{n} = \frac{3}{6} = \frac{1}{2}$
allows us to know the probability that each event Ak of E is the cause of B. For this reason, it is also
The aim is to obtain the probability that the cause of contracting influenza is the fact of belonging to
a certain demographic sector (for example, the demographic sector made up of boys or girls).
ANALYSIS
From what has been stated above, it follows that:
• The value of n is taken as the sum of the sample space $\sum A_k$, such that
n = 50000
• The value of h for the events Ak is each of the values given in the population distribution
table.
• The probability of being a girl, boy, woman or man in the city is obtained with $P(A_k)$. It is considered
an a priori probability.
• The probability of having influenza for a girl, boy, woman or man is obtained
with $P(B|A_k)$ and is considered a conditional probability.
• The probability that any inhabitant, regardless of the sector to which he or she belongs, will
have the flu is obtained with
$P(B) = \sum_{k=1}^{n} P(A_k)\,P(B|A_k)$
and is considered a total probability.
• The probability that someone with influenza is a girl, boy, woman or man is obtained with
Bayes' Theorem. This probability is considered an a posteriori probability, allowing us to
answer questions such as: What is the probability that a new case of influenza will be in a
child?
An efficient and orderly way to obtain an a posteriori probability with Bayes' Theorem is to first
NOTICE:
In the following, map(float, <list>) will be used in the source code to convert the elements of
a list into real numbers, as long as doing so does not overload the code.
PROCEDURE
1. A priori probability calculation
Formula:
Data required:
Results:
$P(A_1) = \frac{11000}{50000} = 0.22$   probability of being a girl
$P(A_2) = \frac{9000}{50000} = 0.18$   probability of being a boy
$P(A_3) = \frac{16000}{50000} = 0.32$   probability of being a woman
$P(A_4) = \frac{14000}{50000} = 0.28$   probability of being a man
Python code:
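The original listing for this step is not reproduced here. A minimal sketch, assuming the population counts implied by the results above: the counts 9000, 16000 and 14000 appear in the text, while 11000 for girls is an assumption inferred from 0.22 × 50000.

```python
# Population per demographic sector: girls, boys, women, men.
# 11000 is inferred (assumption); the other counts appear in the results.
population = [11000, 9000, 16000, 14000]
n = float(sum(population))

# A priori probability of each sector: h / n
pa = [h / n for h in population]
rounded = [round(p, 2) for p in pa]
```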
2. Conditional probability
Formula:
$P(B|A_k) = \frac{P(A_k \cap B)}{P(A_k)}$
Data required:
$P(A_k \cap B) = \frac{h}{n}$
h = intersections (data from the table of distribution of influenza cases)
Results:
$P(B|A_1) = \frac{P(A_1 \cap B)}{0.22} = 0.18$   probability of having the flu as a girl
$P(B|A_2) = \frac{P(A_2 \cap B)}{0.18} = 0.16$   probability of having the flu as a boy
$P(B|A_3) = \frac{3000/50000}{0.32} = 0.19$   probability of having the flu as a woman
$P(B|A_4) = \frac{2500/50000}{0.28} = 0.18$   probability of having the flu as a man
Python code:
3. Total probability
Returns: probability that any of the inhabitants, regardless of the demographic sector to which they
Formula:
$P(B) = \sum_{k=1}^{n} P(A_k)\,P(B|A_k)$
Data required:
a priori probability
conditional probability
Results:
P(B) = 0.18
Python code:
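The original listing for this step is not reproduced here. The sketch below chains the a priori and conditional probabilities into the total probability. The case counts for women and men (3000 and 2500) appear in the text; the counts for girls and boys (2000 and 1500) and the girls' population (11000) are placeholder assumptions, so intermediate values need not match the manual solution exactly.

```python
population = [11000, 9000, 16000, 14000]   # girls' count is an assumption
cases = [2000, 1500, 3000, 2500]           # first two counts are assumptions
n = float(sum(population))

pa = [h / n for h in population]                 # a priori probabilities
pi = [k / n for k in cases]                      # intersection probabilities
pba = [pi[i] / pa[i] for i in range(len(pa))]    # conditional probabilities

# Total probability: sum of the products a priori * conditional
pb = sum(pa[i] * pba[i] for i in range(len(pa)))
```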
Remarks:
(a) note that in the above output there will be a difference of .01 with respect to the manual
solution. This is due to the rounding performed in the manual solution. This difference can be
eradicated by using 3 decimal places in the conditional probability values (instead of two) in the
manual solution.
(b) the probability of NOT having the flu will be given by $1 - P(B)$, such that
$1 - 0.18 = 0.82$, but it will not be necessary to use it for this example with
Bayes' theorem.
4. A posteriori probability
Returns: probability of belonging to a specific demographic sector and having the flu.
Formula:
$P(A_k|B) = \frac{P(A_k)\,P(B|A_k)}{\sum_{k=1}^{n} P(A_k)\,P(B|A_k)}$
Data required:
$P(A_k)\,P(B|A_k)$ = the product obtained in each of the terms of total probability
$\sum_{k=1}^{n} P(A_k)\,P(B|A_k)$ = the total probability
Results:
$P(A_1|B) = 0.22$   probability of being a girl having the flu
$P(A_2|B) = \frac{0.03}{0.18} = 0.16$   probability of being a boy having the flu
$P(A_3|B) = \frac{0.06}{0.18} = 0.33$   probability of being a woman having the flu
$P(A_4|B) = \frac{0.05}{0.18} = 0.27$   probability of being a man having the flu
Python code:
FUNCTIONS
# Bayes' Theorem
def bayes(e, b):
    n = float(sum(e))
    pa = [h / n for h in e]
    pi = [k / n for k in b]
    pba = [pi[i] / pa[i] for i in range(len(pi))]
    prods = [pa[i] * pba[i] for i in range(len(pa))]
    pb = sum(prods)
    pab = [p / pb for p in prods]
    return pab
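A usage sketch for the function, repeated here in condensed form so the block runs on its own. The input counts are partly assumptions: 11000 (girls' population) and the case counts 2000 and 1500 are placeholders, since those figures are not legible in the case study data.

```python
def bayes(e, b):
    n = float(sum(e))
    pa = [h / n for h in e]                            # a priori
    pi = [k / n for k in b]                            # intersections
    pba = [pi[i] / pa[i] for i in range(len(pi))]      # conditional
    prods = [pa[i] * pba[i] for i in range(len(pa))]   # terms of the total
    pb = sum(prods)                                    # total probability
    return [p / pb for p in prods]                     # a posteriori

population = [11000, 9000, 16000, 14000]   # partly assumed counts
cases = [2000, 1500, 3000, 2500]           # partly assumed counts
posterior = bayes(population, cases)
```

Since every influenza case belongs to exactly one sector, the a posteriori probabilities add up to 1, which is a quick sanity check on the result.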
COMPLEMENTARY BIBLIOGRAPHY
[0] Probability and Statistics, Murray Spiegel. McGraw-Hill, Mexico 1988. ISBN: 968-451-102-7
samples = [12, 23, 24, 24, 22, 10, 17] # sample list
n = len(samples)
average = sum(samples) / float(n)
Mean
Population variance
$\sigma^2 = \frac{\sum (x_i - \mu)^2}{n}$
Sample variance
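A minimal sketch of both variances for the sample list above, assuming the usual definitions: the sum of squared deviations from the mean divided by n for the population variance, and by n - 1 for the sample variance. The cross-check with the standard library's statistics module is an addition for illustration.

```python
import statistics

samples = [12, 23, 24, 24, 22, 10, 17]  # sample list from the text above
n = len(samples)
average = sum(samples) / float(n)

# Sum of squared deviations from the mean
squared = sum((x - average) ** 2 for x in samples)

population_variance = squared / n        # divide by n
sample_variance = squared / (n - 1)      # divide by n - 1

# Cross-check against the standard library
matches = (
    abs(population_variance - statistics.pvariance(samples)) < 1e-9
    and abs(sample_variance - statistics.variance(samples)) < 1e-9
)
```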
absolutes = []
frequencies = []
for n in samples:
    if n not in absolutes:
        absolutes.append(n)
        fi = samples.count(n)
        frequencies.append(fi)
N = sum(frequencies) # == len(samples)
# RELATIVE FREQUENCY
# Quotient between absolute frequency and N
relative = [float(fi) / N for fi in frequencies]
sumarelative = round(sum(relative)) # == 1
# CUMULATIVE FREQUENCY
# Sum of all frequencies less than or equal to the absolute frequency
frequencies.sort()
cumulative = [sum(frequencies[:i+1]) for i, fi in enumerate(frequencies)]
the option chosen by the user. Here is a trick to solve this in a simple and ingenious way.
2) Secondly, it is necessary that all functions have their corresponding documentation, defining
def read_file():
    """Read CSV file"""
    return "read"
def write_file():
    """Write CSV file"""
    return "write"
def _sum_numbers(list):
    """Add the numbers in a list"""
    return "private"
3) Next, a list is defined with the name of all the functions that will be accessible by the user from
the menu:
The trick is to automate both the generation of the menu and the function call.
echo(menu)
option = int(get("Your option: "))
# echo and get: hacks learned in the introductory course
Finally, to dynamically access the function chosen by the user, the trick is to use the option chosen
by the user, as an index to access the function name from the list, and again resort to locals to invoke
the function:
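The steps above can be sketched as follows. This is an assumption about how the pieces fit together, not the original listing: the `functions` list, the menu format, and the fixed `option` value (standing in for the user's input) are all illustrative.

```python
def read_file():
    """Read CSV file"""
    return "read"

def write_file():
    """Write CSV file"""
    return "write"

# Names of the functions exposed in the menu (assumed list for this sketch).
functions = ['read_file', 'write_file']

# At module level, locals() maps every defined name to its object.
namespace = locals()

# Build the menu text from each function's docstring.
menu = "\n".join(
    "{0}. {1}".format(i + 1, namespace[name].__doc__)
    for i, name in enumerate(functions)
)
print(menu)

option = 2   # stand-in for the option typed by the user

# Use the option as an index into the list, then call the function by name.
result = namespace[functions[option - 1]]()
```

Because the menu is built from the list and the docstrings, adding a new entry only requires defining the function and appending its name to `functions`.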
If you reached the end of the course you can obtain a triple certification:
If you need to prepare for your exam, you can register in the
Data Science with Python course at
Escuela de Informática Eugenia Bahit
www.eugeniabahit.com