algorithmeknn-121213175830-phpapp02
[Figure: terms surrounding Machine Learning: Normalization, Reinforcement, Logits, Natural Language Processing, Entropy, Big Data, Computer Science, Data Mining, Optimization]
[Figure: Machine Learning practitioner domains, on a continuum from Hacker through Practitioner to Academic]
What is Machine Learning?
Learning by Example
Given a bunch of examples (data), extract a
meaningful pattern upon which to act.
[Figure: Users, Data, Machine Learning, Computer Science, Machine or App]
Types of Algorithms by Output
Input training data to fit a model which is then
used to predict incoming inputs into ...
● Ridge Regression
● LASSO (Least Absolute Shrinkage & Selection Operator)
● Elastic Net
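The three regularized regression methods listed above are available in scikit-learn's `linear_model` module. The following is a minimal sketch, assuming synthetic data and `alpha` values picked only for illustration; none of these numbers come from the slides.

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso, ElasticNet

# Synthetic data (an assumption for illustration): only the first two
# of five features influence the target, the rest are noise.
rng = np.random.RandomState(0)
X = rng.randn(100, 5)
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.randn(100)

# alpha controls the penalty strength; the values here are arbitrary.
models = {
    "ridge": Ridge(alpha=1.0),                            # L2 penalty
    "lasso": Lasso(alpha=0.1),                            # L1 penalty
    "elastic_net": ElasticNet(alpha=0.1, l1_ratio=0.5),   # mix of L1 and L2
}

# Fit each model and collect its learned coefficients.
coefs = {name: model.fit(X, y).coef_ for name, model in models.items()}
for name, coef in coefs.items():
    print(name, np.round(coef, 2))
```

Comparing the printed coefficients is a quick way to see how each penalty shrinks the weights of the noise features.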
Models: Decision Trees
Model of decisions based on data attributes.
Predictions are made by following forks in a
tree structure until a decision is made. Used for
classification & regression.
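As a minimal sketch of "following forks in a tree", the example below fits scikit-learn's `DecisionTreeClassifier` on a tiny made-up dataset (an assumption for illustration) and prints the learned rules with `export_text`.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy data (hypothetical): the label simply equals the first attribute.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 0, 1, 1]

tree = DecisionTreeClassifier(random_state=0)
tree.fit(X, y)

# Each line of the text dump is one fork or one leaf decision.
rules = export_text(tree, feature_names=["f0", "f1"])
print(rules)

# Prediction walks the forks from the root down to a leaf.
predictions = tree.predict([[1, 0], [0, 1]])
```

The same `fit` and `predict` calls work with `DecisionTreeRegressor` when the target is continuous.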
● Naive Bayes
● Averaged One-Dependence Estimators (AODE)
● Bayesian Belief Network (BBN)
Models: Kernel Methods
Map input data into higher dimensional vector
space where the problem is easier to model.
Named after the “kernel trick” which computes
the inner product of images of pairs of data.
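The kernel trick can be checked by hand for a small case. The sketch below assumes a degree-2 polynomial kernel k(x, z) = (x · z)^2 on 2-dimensional inputs, chosen here only as an illustration, and shows that it equals the inner product of the explicitly mapped points without ever needing to build the mapping.

```python
import math

def phi(v):
    """Explicit degree-2 feature map for a 2-d point (x1, x2)."""
    x1, x2 = v
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def dot(a, b):
    """Plain inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def poly_kernel(x, z):
    """Degree-2 polynomial kernel computed in the original 2-d space."""
    return dot(x, z) ** 2

x = (1.0, 2.0)
z = (3.0, 0.5)

explicit = dot(phi(x), phi(z))   # inner product in the 3-d feature space
implicit = poly_kernel(x, z)     # same value, computed in 2-d only
print(explicit, implicit)
```

The kernel evaluates the higher dimensional inner product at the cost of a 2-dimensional one, which is what makes very high dimensional mappings practical.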
● k-Means
● Affinity Propagation
● OPTICS (Ordering Points to Identify Cluster Structure)
● Agglomerative Clustering
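A minimal sketch of the first method in this list, k-Means, using scikit-learn's `KMeans`. The six points are made up for illustration and form two obvious groups.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two well-separated groups of points (hypothetical data).
X = np.array([
    [1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
    [8.0, 8.0], [8.1, 7.9], [7.8, 8.2],
])

# No labels are given: the algorithm discovers the grouping itself.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
kmeans.fit(X)

labels = kmeans.labels_             # cluster index assigned to each point
centers = kmeans.cluster_centers_   # one centroid per cluster
print(labels)
print(centers)
```

The cluster indices themselves are arbitrary; only the grouping of points matters.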
Models: Artificial Neural Networks
Inspired by biological neural networks, ANNs are
nonlinear function approximators that estimate
functions with a large number of inputs.
- System of interconnected neurons that activate
- Deep learning extends simple networks recursively
● Perceptron
● Back-Propagation
● Hopfield Network
● Restricted Boltzmann Machine (RBM)
● Deep Belief Networks (DBN)
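The simplest entry in this list, the Perceptron, fits in a few lines of plain Python. The sketch below is an illustration only: a single neuron with a step activation, trained with the classic perceptron update rule on the logical AND function (data and learning rate are assumptions, not taken from the slides).

```python
# Training data for logical AND: inputs and target outputs.
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights = [0.0, 0.0]
bias = 0.0
learning_rate = 1.0

def activate(x):
    """Step activation: the neuron fires when the weighted sum is positive."""
    total = weights[0] * x[0] + weights[1] * x[1] + bias
    return 1 if total > 0 else 0

# Perceptron rule: nudge the weights by the error on each sample.
for epoch in range(20):
    for x, target in samples:
        error = target - activate(x)
        weights[0] += learning_rate * error * x[0]
        weights[1] += learning_rate * error * x[1]
        bias += learning_rate * error

predictions = [activate(x) for x, _ in samples]
print(predictions)
```

A single perceptron can only learn linearly separable functions such as AND; stacking layers and training them with back-propagation removes that restriction.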
Models: Ensembles
Models composed of multiple weak models that
are trained independently and whose outputs
are combined to make an overall prediction.
● Boosting
● Bootstrapped Aggregation (Bagging)
● AdaBoost
● Stacked Generalization (blending)
● Gradient Boosting Machines (GBM)
● Random Forest
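To make "trained independently and combined" concrete, here is a hand-rolled sketch of bootstrapped aggregation (bagging) with shallow decision trees as the weak models. The dataset comes from `make_classification` and the number of trees and depth are arbitrary choices for illustration; in practice scikit-learn's `BaggingClassifier` or `RandomForestClassifier` would do this work.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (labels are 0 and 1).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
rng = np.random.RandomState(0)

# Train each weak model independently on its own bootstrap sample,
# i.e. rows drawn from the training set with replacement.
trees = []
for _ in range(25):
    idx = rng.randint(0, len(X), size=len(X))
    weak = DecisionTreeClassifier(max_depth=2, random_state=0)
    trees.append(weak.fit(X[idx], y[idx]))

# Combine the individual outputs by majority vote.
votes = np.array([t.predict(X) for t in trees])      # shape (25, 200)
ensemble_pred = (votes.mean(axis=0) > 0.5).astype(int)

accuracy = (ensemble_pred == y).mean()   # accuracy on the training data
print(accuracy)
```

Boosting differs in that the weak models are trained sequentially, each one focusing on the errors of the previous ones.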
Models: Other
The preceding list is not comprehensive; other
algorithm and model classes include:
● Conditional Random Fields (CRF)
● Markovian Models (HMMs)
● Dimensionality Reduction (PCA, PLS)
● Rule Learning (Apriori, Brill)
● More ...
An Architecture for Operationalizing
Machine Learning Algorithms
Build Phase: Training Data, Labels, Feature Vectors, Estimation Algorithm
Operational Phase: Model Building, Service/API, Feedback
- Scikit-Learn Tutorial
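class Estimator(object):
    def fit(self, X, y=None): ...   # learn state from the data, return self
    def predict(self, X): ...       # return predictions for X

from sklearn import svm
estimator = svm.SVC(gamma=0.001)
estimator.fit(X, y)        # X: training samples, y: training labels
estimator.predict(x)       # x: new samples to label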
Basic methodology
Wrapping fit and predict
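Any object that follows the same `fit` / `predict` convention behaves like an estimator. The class below is a hypothetical example written for illustration (it is not part of scikit-learn): it always predicts the most common training label.

```python
from collections import Counter

class MajorityClassifier(object):
    """Hypothetical estimator: predicts the most frequent training label."""

    def fit(self, X, y):
        # Learned state is stored on self; the trailing underscore follows
        # the scikit-learn naming convention for fitted attributes.
        self.majority_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        # One prediction per input sample.
        return [self.majority_ for _ in X]

model = MajorityClassifier().fit([[0], [1], [2]], ["a", "b", "b"])
predictions = model.predict([[5], [6]])
print(predictions)
```

Returning `self` from `fit` is what allows the chained `MajorityClassifier().fit(...)` call.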
We’ve already discussed a broad workflow. The
following is a development workflow:
[Diagram: Raw Data, Load & Transform Data, Feature Extraction, Build Model, Evaluate Model, Feature Evaluation]
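class Transformer(Estimator):
    def transform(self, X): ...   # return a transformed copy of X

import numpy as np
from sklearn import preprocessing
from sklearn.impute import SimpleImputer  # replaces the removed preprocessing.Imputer

Xt = preprocessing.normalize(X)  # function form of Normalizer
Xt = preprocessing.scale(X)      # function form of StandardScaler
imputer = SimpleImputer(missing_values=np.nan, strategy='mean')
Xt = imputer.fit_transform(X)    # fill missing values with the column mean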
Transformers
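Transformers can be chained so that the output of one feeds the next. A minimal sketch, assuming a tiny made-up matrix with one missing value, using scikit-learn's `make_pipeline`:

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical data: the second column has a missing entry.
X = np.array([
    [1.0, 10.0],
    [2.0, np.nan],
    [3.0, 30.0],
])

# Step 1 fills missing values with the column mean,
# step 2 rescales each column to zero mean and unit variance.
pipeline = make_pipeline(SimpleImputer(strategy="mean"), StandardScaler())
Xt = pipeline.fit_transform(X)
print(Xt)
```

Because the pipeline itself exposes `fit` and `transform`, it can be fitted on training data and then reused unchanged on new data via `pipeline.transform`.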
Cross Validation (classification)
Assess how a model will generalize to an independent data set
(i.e. data not in the training set).
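import numpy as np

# ClassifierEstimator / RegressionEstimator stand in for any scikit-learn estimator.

# Classification: fit on the training split, predict on the held-out split
model = ClassifierEstimator()
model.fit(X_train, y_train)
expected = y_test
predicted = model.predict(X_test)

# Regression: same pattern, scored with mean squared error
model = RegressionEstimator()
model.fit(X_train, y_train)
expected = y_test
predicted = model.predict(X_test)
MSE = np.mean((predicted - expected) ** 2)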
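The snippets above assume `X_train`, `X_test`, `y_train` and `y_test` already exist. One way to obtain them is sketched below, assuming the bundled iris dataset, a 20% hold-out split and 5 folds; those choices are illustrative and do not come from the slides.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Hold out 20% of the data that the model never sees during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = SVC(gamma=0.001)
model.fit(X_train, y_train)
holdout_accuracy = model.score(X_test, y_test)   # accuracy on unseen data

# k-fold cross validation repeats the split so that every sample
# is used for testing exactly once.
scores = cross_val_score(SVC(gamma=0.001), X, y, cv=5)
print(holdout_accuracy, scores.mean())
```

Averaging over folds gives a less split-dependent estimate than a single hold-out set, at the cost of fitting the model several times.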