# Introduction to Python 

## Python Code Optimization

References: 
http://cython.readthedocs.io/en/latest/src/tutorial/cython_tutorial.html 
https://numba.pydata.org/ 
https://gist.github.com/jfpuget/b53f1e15a37aba5944ad 

In [52]:
import random
import time
import glob
import os
import math

Let's create a recursive Fibonacci function and test it with different parameters:

In [2]:
def fib(n):
    if n < 2:
        return 1
    return fib(n-1) + fib(n-2)

In [3]:
%timeit fib(20)

2.12 ms ± 50.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [4]:
%timeit fib(30)

262 ms ± 8.07 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [5]:
%timeit fib(33)

1.13 s ± 15 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
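The steep growth in running time follows from the number of calls the naive recursion makes. A minimal sketch, using a hypothetical helper `count_calls` that is not part of the original notebook, counts those calls with the same recurrence:

```python
def count_calls(n):
    """Return the number of calls the naive recursive fib makes for input n."""
    if n < 2:
        return 1  # a base case is a single call
    # one call for this frame plus the calls made by both recursive branches
    return 1 + count_calls(n - 1) + count_calls(n - 2)


for n in (10, 15, 20):
    print(n, count_calls(n))
```

The count grows by a factor of roughly 1.6 for each unit increase of `n`, which is broadly consistent with the jump from milliseconds at `n = 20` to more than a second at `n = 33`.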


#### Now, let's compile with Cython: 

+ pip3 install Cython

In [6]:
%load_ext Cython

In [7]:
%%cython

def fib_cython(n):
    if n < 2:
        return 1
    return fib_cython(n-1) + fib_cython(n-2)

In [8]:
%timeit fib_cython(20)

649 µs ± 7.94 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [9]:
%timeit fib_cython(30)

78.9 ms ± 1.12 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [10]:
%timeit fib_cython(33)

347 ms ± 8.97 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


We can go further by declaring static types for the variables. The function must be declared with `cpdef` instead of `def`; that way we can use [C types](https://www.tutorialspoint.com/cprogramming/c_data_types.htm) for the function parameters:

In [11]:
%%cython

cpdef long fib_cython_type(long n):
    if n < 2:
        return 1
    return fib_cython_type(n-1) + fib_cython_type(n-2)

In [12]:
%timeit fib_cython_type(20)

18.1 µs ± 592 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [13]:
%timeit fib_cython_type(30)

2.19 ms ± 69.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [14]:
%timeit fib_cython_type(33)

9.35 ms ± 303 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


Another approach is to store the results in a _cache_. This keeps the arbitrary precision of Python integers, without resorting to [C static types](https://notes-on-cython.readthedocs.io/en/latest/function_declarations.html):

In [15]:
from functools import lru_cache as cache #Python3

In [16]:
@cache(maxsize=None)
def fib_cache(n):
    if n < 2:
        return 1
    return fib_cache(n-1) + fib_cache(n-2)

In [17]:
%timeit fib_cache(20)

67.1 ns ± 1.27 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In [18]:
%timeit fib_cache(30)

66.1 ns ± 1.2 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In [19]:
%timeit fib_cache(33)

66 ns ± 0.93 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
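The near-constant timings are consistent with `%timeit` measuring mostly cache hits: after the first call, repeated calls with the same argument are answered from the cache. A small sketch using `cache_info()`, which `functools.lru_cache` attaches to the decorated function (the name `fib_cached` is chosen here for illustration):

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib_cached(n):
    if n < 2:
        return 1
    return fib_cached(n - 1) + fib_cached(n - 2)


fib_cached(30)                    # first call fills the cache for 0..30
first = fib_cached.cache_info()
fib_cached(30)                    # second call is a single cache lookup
second = fib_cached.cache_info()

print(first)
print(second)
```

Between the two snapshots only the hit counter should change, so the timed loop says little about the cost of the first, uncached call.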


Another way to get the effect of the cache is to change the algorithm so that each value is computed only once:

In [20]:
def fib_seq(n):
    if n < 2:
        return 1
    # start from fib(1), fib(0) = 1, 1 so the result matches the recursive fib(n)
    a, b = 1, 1
    for i in range(n-1):
        a, b = a + b, a
    return a

In [21]:
%timeit fib_seq(20)

965 ns ± 22.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [22]:
%timeit fib_seq(30)

1.37 µs ± 37.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [23]:
%timeit fib_seq(33)

1.48 µs ± 47.8 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


Let's compile this new function with and without static typing:

In [24]:
%%cython

def fib_seq_cython(n):
    if n < 2:
        return 1
    # start from fib(1), fib(0) = 1, 1 so the result matches the recursive fib(n)
    a, b = 1, 1
    for i in range(n-1):
        a, b = a + b, a
    return a

In [25]:
%timeit fib_seq_cython(20)

535 ns ± 10.8 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [26]:
%timeit fib_seq_cython(30)

771 ns ± 21.8 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [27]:
%timeit fib_seq_cython(33)

800 ns ± 14.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [28]:
%%cython

cpdef long fib_seq_cython_type(long n):
    if n < 2:
        return 1
    cdef long a, b
    # start from fib(1), fib(0) = 1, 1 so the result matches the recursive fib(n)
    a, b = 1, 1
    for i in range(n-1):
        a, b = a+b, a
    return a

In [29]:
%timeit fib_seq_cython_type(20)

64 ns ± 3.41 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In [30]:
%timeit fib_seq_cython_type(30)

67.2 ns ± 4.07 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In [31]:
%timeit fib_seq_cython_type(33)

69.6 ns ± 0.903 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
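One caveat of the typed versions above: a C `long` has a fixed width, so a result that no longer fits overflows (typically wrapping around silently), while Python integers keep growing. A sketch in plain Python, assuming a 64-bit signed `long` (common on Linux and macOS, but 32 bits on Windows), finds the first `n` whose value no longer fits. The names `fib_iter` and `LONG_MAX` are introduced here for illustration:

```python
LONG_MAX = 2**63 - 1  # assumption: C long is a 64-bit signed integer


def fib_iter(n):
    """Iterative Fibonacci with the convention fib(0) = fib(1) = 1."""
    a, b = 1, 1
    for _ in range(n - 1):
        a, b = a + b, a
    return a


# smallest n whose result would not fit in a 64-bit signed long
first_overflow = next(n for n in range(200) if fib_iter(n) > LONG_MAX)
print(first_overflow, fib_iter(first_overflow))
```

For the small arguments timed in this notebook the typed functions are safe; the check only matters for much larger `n`.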


We can also use a tool called Numba, which is a just-in-time (JIT) compiler:

In [32]:
from numba import jit

In [39]:
@jit
def fib_numba(n):
    if n < 2:
        return 1
    return fib_numba(n-1) + fib_numba(n-2)

In [40]:
%timeit fib_numba(20)

74 µs ± 2.64 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


In [42]:
%timeit fib_numba(30)

9.18 ms ± 368 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [41]:
%timeit fib_numba(33)

38.2 ms ± 964 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [33]:
@jit
def fib_seq_numba(n):
    if n < 2:
        return 1
    # start from fib(1), fib(0) = 1, 1 so the result matches the recursive fib(n)
    a, b = 1, 1
    for i in range(n-1):
        a, b = a + b, a
    return a

In [34]:
%timeit fib_seq_numba(20)

194 ns ± 10.5 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In [35]:
%timeit fib_seq_numba(30)

203 ns ± 3.61 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [36]:
%timeit fib_seq_numba(33)

194 ns ± 10.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


Second example: the Quick Sort sorting algorithm

In [43]:
def qsort_kernel(a, lo, hi):
    i = lo
    j = hi
    while i < hi:
        pivot = a[(lo+hi) // 2]
        while i <= j:
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        if lo < j:
            qsort_kernel(a, lo, j)
        lo = i
        j = hi
    return a

def benchmark_qsort():
    lst = [random.random() for i in range(1,5000)]
    qsort_kernel(lst, 0, len(lst)-1)
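Before timing a hand-written sort it helps to confirm that it sorts correctly. A quick sanity check, repeating the kernel from the cell above so the snippet is self-contained and comparing it against the built-in `sorted` (the seed and list size are arbitrary choices):

```python
import random


def qsort_kernel(a, lo, hi):
    """In-place quicksort of a[lo..hi], same algorithm as in the cell above."""
    i = lo
    j = hi
    while i < hi:
        pivot = a[(lo + hi) // 2]
        while i <= j:
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        if lo < j:
            qsort_kernel(a, lo, j)   # recurse into the left partition
        lo = i                       # continue with the right partition
        j = hi
    return a


random.seed(0)  # fixed seed so the check is reproducible
data = [random.random() for _ in range(1000)]
result = qsort_kernel(list(data), 0, len(data) - 1)  # sort a copy
print(result == sorted(data))
```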

Using Numba:

In [44]:
@jit
def qsort_kernel_numba(a, lo, hi):
    i = lo
    j = hi
    while i < hi:
        pivot = a[(lo+hi) // 2]
        while i <= j:
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        if lo < j:
            qsort_kernel_numba(a, lo, j)
        lo = i
        j = hi
    return a

def benchmark_qsort_numba():
    lst = [random.random() for i in range(1,5000)]
    qsort_kernel_numba(lst, 0, len(lst)-1)

Numba and NumPy:

In [45]:
import numpy as np  # needed by the benchmark function below

@jit
def qsort_kernel_numba_numpy(a, lo, hi):
    i = lo
    j = hi
    while i < hi:
        pivot = a[(lo+hi) // 2]
        while i <= j:
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        if lo < j:
            qsort_kernel_numba_numpy(a, lo, j)
        lo = i
        j = hi
    return a

def benchmark_qsort_numba_numpy():
    lst = np.random.rand(5000)
    # call the kernel defined in this cell (the original called qsort_kernel_numba)
    qsort_kernel_numba_numpy(lst, 0, len(lst)-1)

Using NumPy and Cython:

In [50]:
%%cython
import numpy as np
import cython

@cython.boundscheck(False)
@cython.wraparound(False)
cdef double[:] qsort_kernel_cython_numpy_type(double[:] a, long lo, long hi):
    cdef:
        long i, j
        double pivot
    i = lo
    j = hi
    while i < hi:
        pivot = a[(lo+hi) // 2]
        while i <= j:
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        if lo < j:
            qsort_kernel_cython_numpy_type(a, lo, j)
        lo = i
        j = hi
    return a

def benchmark_qsort_numpy_cython():
    lst = np.random.rand(5000)
    qsort_kernel_cython_numpy_type(lst, 0, len(lst)-1)

def benchmark_sort_numpy():
    lst = np.random.rand(5000)
    np.sort(lst)

In [51]:
%timeit benchmark_qsort()
%timeit benchmark_qsort_numba()
%timeit benchmark_qsort_numba_numpy()
%timeit benchmark_qsort_numpy_cython()
%timeit benchmark_sort_numpy()

9.32 ms ± 103 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
7.73 ms ± 414 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
414 µs ± 10.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
468 µs ± 20.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
290 µs ± 9.35 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
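Note that each `benchmark_*` function above builds its input inside the timed region, so creating the list or array is part of the reported time. A sketch of separating the two with the standard `timeit` module, using the built-in `sorted` as a stand-in for the sort kernels (`make_data` and the repeat counts are illustrative choices, not part of the original notebook):

```python
import random
import timeit


def make_data(n=5000):
    """Build the input list; kept outside the timed statement when possible."""
    return [random.random() for _ in range(n)]


data = make_data()
runs = 20

# time only the sort; sorted() returns a new list, so data stays unsorted
sort_only = min(timeit.repeat(lambda: sorted(data), number=runs, repeat=5)) / runs

# time input creation plus the sort, as the benchmark functions above do
create_and_sort = min(
    timeit.repeat(lambda: sorted(make_data()), number=runs, repeat=5)
) / runs

print(f"sort only:       {sort_only * 1e6:.1f} µs")
print(f"create and sort: {create_and_sort * 1e6:.1f} µs")
```

An in-place sort needs a fresh copy of the data for every run, otherwise later runs would time an already sorted input.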


## Multithreading

### [Threading](https://www.tutorialspoint.com/python3/python_multithreading.htm) module

In [2]:
import threading

In [3]:
exitFlag = 0

In [4]:
class myThread(threading.Thread):
    def __init__(self, threadID, name, counter, delay):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.name = name
        self.counter = counter
        self.delay = delay

    def run(self):
        print("Starting {}...\n ".format(self.name))
        print_time(self.name, self.counter, self.delay)
        print("Exiting {}... \n ".format(self.name))

def print_time(threadName, counter, delay):
    while counter:
        if exitFlag:
            # threadName is a str and has no exit() method; returning ends the thread's work
            return
        time.sleep(delay)
        print("{}: {}\n".format(threadName, time.ctime(time.time())))
        counter -= 1

In [5]:
# Create new threads
thread1 = myThread(1, "Thread-1", 5, 1)
thread2 = myThread(2, "Thread-2", 5, 2)
thread3 = myThread(3, "Thread-3", 5, 2)

In [6]:
# Start new Threads
thread1.start()  # calls the run method
thread2.start()
thread3.start()
thread1.join()  # waits for the thread to finish
thread2.join()
thread3.join()
print("Exiting Main Thread")

Starting Thread-1...
 
Starting Thread-2...
 
Starting Thread-3...
 
Thread-1: Mon Oct 22 15:49:06 2018

Thread-1: Mon Oct 22 15:49:07 2018
Thread-2: Mon Oct 22 15:49:07 2018


Thread-3: Mon Oct 22 15:49:07 2018

Thread-1: Mon Oct 22 15:49:08 2018

Thread-3: Mon Oct 22 15:49:09 2018

Thread-2: Mon Oct 22 15:49:09 2018
Thread-1: Mon Oct 22 15:49:09 2018


Thread-1: Mon Oct 22 15:49:10 2018

Exiting Thread-1... 
 
Thread-3: Mon Oct 22 15:49:11 2018

Thread-2: Mon Oct 22 15:49:11 2018

Thread-3: Mon Oct 22 15:49:13 2018

Thread-2: Mon Oct 22 15:49:13 2018

Thread-3: Mon Oct 22 15:49:15 2018

Exiting Thread-3... 
 
Thread-2: Mon Oct 22 15:49:15 2018

Exiting Thread-2... 
 
Exiting Main Thread
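The `exitFlag` global is never set in the run above. When workers do need to be stopped early, `threading.Event` from the standard library is a common alternative to a bare global flag. A minimal sketch, with `worker`, `stop_event` and the delays chosen here for illustration:

```python
import threading
import time


def worker(name, delay, stop_event, log):
    """Append a tick to log every `delay` seconds until stop_event is set."""
    # wait() returns True as soon as the event is set, False on timeout
    while not stop_event.wait(timeout=delay):
        log.append(name)


stop_event = threading.Event()
log = []
threads = [
    threading.Thread(target=worker, args=(f"Thread-{i}", 0.05, stop_event, log))
    for i in (1, 2)
]
for t in threads:
    t.start()

time.sleep(0.3)     # let the workers run for a moment
stop_event.set()    # ask every worker to finish
for t in threads:
    t.join()        # wait until they have actually stopped

print(len(log), "ticks recorded; all stopped:",
      not any(t.is_alive() for t in threads))
```

Because `wait()` doubles as the sleep, a worker reacts to the stop request without first finishing a full delay.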


### [Multiprocessing](https://docs.python.org/3/library/multiprocessing.html#module-multiprocessing) module

In [17]:
from multiprocessing import Process
import os

def info(title):
    print(title)
    print('module name:', __name__)
    print('parent process:', os.getppid())
    print('process id:', os.getpid())

def f(name):
    info('function f')
    print('hello', name)

if __name__ == '__main__':
    info('main line')
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()

main line
module name: __main__
parent process: 56048
process id: 57151
function f
module name: __main__
parent process: 57151
process id: 76230
hello bob


### [Concurrent futures](https://docs.python.org/3/library/concurrent.futures.html) module

In [11]:
import concurrent.futures
import urllib.request

In [15]:
PRIMES = [
    112272535095293,
    112582705942171,
    112272535095293,
    115280095190773,
    115797848077099,
    1099726899285419]

def is_prime(n):
    if n < 2:
        return False
    if n == 2:
        return True  # 2 is the only even prime
    if n % 2 == 0:
        return False
    sqrt_n = int(math.floor(math.sqrt(n)))
    # test odd divisors up to the square root
    for i in range(3, sqrt_n + 1, 2):
        if n % i == 0:
            return False
    return True

def main():
    with concurrent.futures.ProcessPoolExecutor() as executor:
        for number, prime in zip(PRIMES, executor.map(is_prime, PRIMES)):
            print('%d is prime: %s' % (number, prime))

if __name__ == '__main__':
    main()

112272535095293 is prime: True
112582705942171 is prime: True
112272535095293 is prime: True
115280095190773 is prime: True
115797848077099 is prime: True
1099726899285419 is prime: False


In [13]:
URLS = ['http://www.foxnews.com/',
 'http://www.cnn.com/',
 'http://europe.wsj.com/',
 'http://www.bbc.co.uk/',
 'http://www.oglobo.com/']

# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))

'http://www.oglobo.com/' page is 134590 bytes
'http://europe.wsj.com/' page is 994081 bytes
'http://www.cnn.com/' page is 1723318 bytes
'http://www.foxnews.com/' page is 230195 bytes
'http://www.bbc.co.uk/' page is 305605 bytes
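The pages above are reported in a different order than they appear in `URLS` because `as_completed` yields futures as they finish, whereas `executor.map` yields results in input order. A sketch without network access, where the hypothetical `slow_echo` stands in for a slow download and the delays are arbitrary:

```python
import concurrent.futures
import time


def slow_echo(delay):
    """Sleep for `delay` seconds, then return it (stands in for a slow download)."""
    time.sleep(delay)
    return delay


delays = [0.4, 0.05, 0.2]

with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    # map() yields results in the order of the inputs
    in_input_order = list(executor.map(slow_echo, delays))

    # as_completed() yields futures as they finish, shortest delay first here
    futures = [executor.submit(slow_echo, d) for d in delays]
    in_completion_order = [
        f.result() for f in concurrent.futures.as_completed(futures)
    ]

print(in_input_order)
print(in_completion_order)
```

`map` is convenient when results must line up with their inputs; `as_completed` lets the program handle each result as soon as it is ready.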
