Notes Data Science 1
#Chapter 1
'''
Use of data mining
Data Mining
Data Science, Engineering, and Data-driven decision making
Target example
Automated decision making
Big data
Big Data 1.0 to Big Data 2.0
Data as an asset
Credit cards
Success stories
Data-analytic thinking
Data Mining and Data Science
Data Science vs Data Scientist
'''
#Chapter 2
#A set of canonical data mining tasks; The data mining process; Supervised vs Unsupervised data mining
'''
Classification and class probability estimation
Regression
Similarity matching
Clustering
Co-occurrence grouping (frequent itemset mining, association rule discovery, market-basket analysis)
Profiling
Data reduction
Causal modeling
Supervised vs Unsupervised methods (see the sketch after this outline)
The data mining process
CRISP-DM
Business Understanding
Data preparation
Leak
Evaluating the results of data mining
Statistics
Database Querying
Data Warehousing
Regression analysis
Machine Learning and Data Mining
'''
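# Sketch (my addition, not from the book): the supervised vs unsupervised
# distinction from Chapter 2 in code, using scikit-learn as an assumed library.
# Supervised: a target label y is given and the model learns to predict it.
# Unsupervised: no target is given; the algorithm finds structure in X alone.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

X = np.array([[1.0], [2.0], [3.0], [8.0], [9.0], [10.0]])   # toy feature values
y = np.array([0, 0, 0, 1, 1, 1])                            # labels (used only by the supervised model)

clf = LogisticRegression().fit(X, y)          # classification: learns from the labels y
print(clf.predict([[2.5], [9.5]]))            # -> [0 1]

km = KMeans(n_clusters=2, n_init=10).fit(X)   # clustering: ignores the labels entirely
print(km.labels_)                             # two groups found from X alone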
#NUMPY
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(type(arr))
OUTPUT: <class 'numpy.ndarray'>
def calc_prob(f_x):
    # logistic (sigmoid) function: maps a raw score f(x) to a probability in (0, 1)
    return 1 / (1 + np.exp(-f_x))
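# Usage sketch (my addition): calc_prob maps a linear score f(x) to a class
# probability; the example scores below are illustrative, not from the notes.
scores = np.array([-2.0, 0.0, 2.0])
print(calc_prob(scores))   # -> approx [0.119 0.5   0.881]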
# NOTE: arr3d was never defined in the original notes; the 2 x 2 x 3 array below
# is an assumed example so that the slice assignments can actually run.
arr3d = np.array([[[1, 2, 3], [4, 5, 6]],
                  [[7, 8, 9], [10, 11, 12]]])
arr3d[:1, :1] = 900   # overwrite the first row of the first 2 x 3 block
arr3d
#######################
arr3d[:1] = 600       # overwrite the entire first block
arr3d
#######################
arr3d[:1, :1] = 900   # set just the first row of the first block back to 900
arr3d
#######################
A = np.array([5, 5, 5])
B = np.array([6, 6, 6])
dot_product = np.dot(A, B)
print(dot_product)
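# Added note: for 1-D arrays np.dot is the inner product, i.e. the sum of the
# element-wise products, so the two lines below are equivalent ways to get 90.
print((A * B).sum())   # 5*6 + 5*6 + 5*6 = 90
print(A @ B)           # the @ operator does the same for ndarrays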
#########################
## Entropy Calculation ##
#########################
# calculate the entropy of an unfair die where each even number shows up twice
# as often as each odd number (odd faces: 1/9 each, even faces: 2/9 each)
from scipy.stats import entropy
# discrete probabilities
p = [1/9, 2/9, 1/9, 2/9, 1/9, 2/9]
# calculate entropy in base 6, so that a fair die would score exactly 1.0
e = entropy(p, base=6)
# print the result (base-6 entropy, not bits)
print('entropy: %.3f (base 6)' % e)   # -> entropy: 0.968 (base 6)
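# Added sketch: the same value computed by hand with numpy, to make explicit what
# scipy.stats.entropy does here: H = -sum(p * log_6(p)).
p_arr = np.array(p)
print('manual entropy: %.3f (base 6)' % -np.sum(p_arr * np.log(p_arr) / np.log(6)))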
#######################
# calculate the entropy of a coin flip where heads comes up 75% of the time and
# tails 25% of the time
from scipy.stats import entropy
# discrete probabilities
p = [3/4, 1/4]
# calculate entropy
e = entropy(p, base=2)
# print the result
print('entropy: %.3f bits' % e)   # -> entropy: 0.811 bits
#######################
## MASKING ##
#######################
data = np.arange(15).reshape(5, 3)
alpha = np.array(['A', 'B', 'C', 'A', 'C'])
alpha == 'A'          # boolean mask: array([ True, False, False,  True, False])
data[alpha == 'A']    # keep only the rows where the mask is True (rows 0 and 3)
OUT: array([[ 0,  1,  2],
            [ 9, 10, 11]])
# https://numpy.org/doc/stable/reference/ufuncs.html
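# Added sketch (not in the original notes): boolean masks can be combined with the
# element-wise logic ufuncs from the reference above, and used with np.where.
mask = (alpha == 'A') | (alpha == 'C')   # rows labelled either 'A' or 'C'
data[mask]                               # rows 0, 2, 3, 4
np.where(alpha == 'A', 1, 0)             # -> array([1, 0, 0, 1, 0])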