15 Math Concepts Every Data Scientist Should Know: Understand and learn how to apply the math behind data science algorithms
By David Hoyle
15 Math Concepts Every Data Scientist Should Know
Copyright © 2024 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Group Product Manager: Niranjan Naikwadi
Publishing Product Manager: Yasir Ali Khan
Content Development Editor: Joseph Sunil
Technical Editor: Seemanjay Ameriya
Copy Editor: Safis Editing
Project Coordinator: Urvi Sharma
Proofreader: Safis Editing
Indexer: Hemangini Bari
Production Designer: Joshua Misquitta
Marketing Coordinator: Vinishka Kalra
First published: July 2024
Production reference: 2221024
Published by Packt Publishing Ltd.
Grosvenor House
11 St Paul’s Square
Birmingham
B3 1RB, UK
ISBN 978-1-83763-418-7
www.packtpub.com
To my wife Clare for her unwavering love, support, and inspiration throughout our life together.
– David Hoyle
Contributors
About the author
David Hoyle has over 30 years’ experience in machine learning, statistics, and mathematical modeling. He gained a BSc in mathematics and physics and a PhD in theoretical physics, both from the University of Bristol, UK. He then embarked on an academic career that included research at the University of Cambridge and leading his own research groups as an Associate Professor at the University of Exeter and the University of Manchester in the UK. For the last 13 years, he has worked in the commercial sector, including for Lloyds Banking Group, one of the UK’s largest retail banks, and as joint Head of Data Science for AutoTrader UK. He now works for the global customer data science company dunnhumby, building statistical and machine learning models for the world’s largest retailers, including Tesco UK and Walmart. He lives and works in Manchester, UK.
This has been a long endeavor. I would like to thank my wife and children for their encouragement, and the team at Packt for their patience and support throughout the process.
About the reviewer
Emmanuel Nyatefe is a data analyst with over 5 years of experience in data analytics, AI, and ML. He holds a Master of Science in Business Analytics from the W. P. Carey School of Business at Arizona State University and a Bachelor of Science in Business Information Technology from Kwame Nkrumah University of Science and Technology. He has led various AI and ML projects, including developing models for detecting crop diseases and applying Generative AI to innovate business solutions and optimize operations. His expertise in data engineering, modeling, and visualization, alongside his proficiency in LLMs and advanced analytics, highlights his significant contributions to data science.
Table of Contents
Preface
Part 1: Essential Concepts
1
Recap of Mathematical Notation and Terminology
Technical requirements
Number systems
Notation for numbers and fields
Complex numbers
What we learned
Linear algebra
Vectors
Matrices
What we learned
Sums, products, and logarithms
Sums and the 𝚺 notation
Products and the 𝚷 notation
Logarithms
What we learned
Differential and integral calculus
Differentiation
Finding maxima and minima
Integration
What we learned
Analysis
Limits
Order notation
Taylor series expansions
What we learned
Combinatorics
Binomial coefficients
What we learned
Summary
Notes and further reading
2
Random Variables and Probability Distributions
Technical requirements
All data is random
A little example
Systematic variation can be learned – random variation can’t
Random variation is not just measurement error
What are the consequences of data being random?
What we learned
Random variables and probability distributions
A new concept – random variables
Summarizing probability distributions
Continuous distributions
Transforming and combining random variables
Named distributions
What we learned
Sampling from distributions
How datasets relate to random variables and probability distributions
How big is the population from which a dataset is sampled?
How to sample
Generating your own random numbers code example
Sampling from numpy distributions code example
What we learned
Understanding statistical estimators
Consistency, bias, and efficiency
The empirical distribution function
What we learned
The Central Limit Theorem
Sums of random variables
CLT code example
CLT example with discrete variables
Computational estimation of a PDF from data
KDE code example
What we learned
Summary
Exercises
3
Matrices and Linear Algebra
Technical requirements
Inner and outer products of vectors
Inner product of two vectors
Outer product of two vectors
What we learned
Matrices as transformations
Matrix multiplication
The identity matrix
The inverse matrix
More examples of matrices as transformations
Matrix transformation code example
What we learned
Matrix decompositions
Eigen-decompositions
Eigenvector and eigenvalues
Eigen-decomposition of a square matrix
Eigen-decomposition code example
Singular value decomposition
The SVD of a complex matrix
What we learned
Matrix properties
Trace
Determinant
What we learned
Matrix factorization and dimensionality reduction
Dimensionality reduction
Principal component analysis
Non-negative matrix factorization
What we learned
Summary
Exercises
Notes and further reading
4
Loss Functions and Optimization
Technical requirements
Loss functions – what are they?
Risk functions
There are many loss functions
Different loss functions = different end results
Loss functions for anything
A loss function by any other name
What we learned
Least Squares
The squared-loss function
OLS regression
OLS, outliers, and robust regression
What we learned
Linear models
Practical issues
The model residuals
OLS regression code example
What we learned
Gradient descent
Locating the minimum of a simple risk function
Gradient descent code example
Gradient descent is a general technique
Beyond simple gradient descent
What we learned
Summary
Exercises
5
Probabilistic Modeling
Technical requirements
Likelihood
A simple probabilistic model
Log likelihood
Maximum likelihood estimation
What we have learned
Bayes’ theorem
Conditional probability and Bayes’ theorem
Priors
The posterior
What we have learned
Bayesian modeling
Bayesian model averaging
MAP estimation
As N → ∞
Least squares as an approximation to Bayesian modeling
What we have learned
Bayesian modeling in practice
Analytic approximation of the posterior
Computational sampling
MCMC code example
Probabilistic programming languages
What we have learned
Summary
Exercises
Part 2: Intermediate Concepts
6
Time Series and Forecasting
Technical requirements
What is time series data?
What does auto-correlation mean for modeling time series data?
The auto-correlation function (ACF)
The partial auto-correlation function (PACF)
Other data science implications of time series data
What we have learned
ARIMA models
Integrated
Auto-regression
Moving average
Combining the AR(p), I(d), and MA(q) into an ARIMA model
Variants of ARIMA modeling
What we have learned
ARIMA modeling in practice
Unit root testing
Interpreting ACF and PACF plots
auto.arima
What we have learned
Machine learning approaches to time series analysis
Routine application of machine learning to time series analysis
Deep learning approaches to time series analysis
AutoML approaches to time series analysis
What we have learned
Summary
Exercises
Notes and further reading
7
Hypothesis Testing
Technical requirements
What is a hypothesis test?
Example
The general form of a hypothesis test
The p-value
The effect of increasing sample size
The effect of decreasing noise
One-tailed and two-tailed tests
Using samples variances in the test statistic – the t-test
Computationally intensive methods for p-value estimation
Parametric versus non-parametric hypothesis tests
What we learned
Confidence intervals
What does a confidence interval really represent?
Confidence intervals for any parameter
A confidence interval code example
What we learned
Type I and Type II errors, and power
What we learned
Summary
Exercises
Notes and further reading
8
Model Complexity
Technical requirements
Generalization, overfitting, and the role of model complexity
Overfitting
Why overfitting is bad
Overfitting increases the variability of predictions
Underfitting is also a problem
Measuring prediction error
What we learned
The bias-variance trade-off
Proof of the bias-variance trade-off formula
Double descent – a modern twist on the generalization error diagram
What we learned
Model complexity measures for model selection
Selecting between classes of models
Akaike Information Criterion
Bayesian Information Criterion
What we learned
Summary
Notes and further reading
9
Function Decomposition
Technical requirements
Why do we want to decompose a function?
What is a decomposition of a function?
Example 1 – decomposing a one-dimensional function into symmetric and anti-symmetric parts
Example 2 – decomposing a time series into its seasonal and non-seasonal components
What we’ve learned
Expanding a function in terms of basis functions
What we’ve learned
Fourier series
What we’ve learned
Fourier transforms
The multi-dimensional Fourier transform
What we’ve learned
The discrete Fourier transform
DFT code example
Uses of the DFT
What is the difference between the DFT, Fourier series, and the Fourier transform?
What we’ve learned
Summary
Exercises
10
Network Analysis
Technical requirements
Graphs and network data
Network data is about relationships
Example 1 – substituting goods in a supermarket
Example 2 – international trade
What is a graph?
What we’ve learned
Basic characteristics of graphs
Undirected and directed edges
The adjacency matrix
In-degree and out-degree
Centrality
What we’ve learned
Different types of graphs
Fully connected graphs
Disconnected graphs
Directed acyclic graphs
Small-world networks
Scale-free networks
What we’ve learned
Community detection and decomposing graphs
What is a community?
How to do community detection
Community detection algorithms
Community detection code example
What we’ve learned
Summary
Exercises
Notes and further reading
Part 3: Selected Advanced Concepts
11
Dynamical Systems
Technical requirements
What is a dynamical system and what is an evolution equation?
Time can be discrete or continuous
Time does not have to mean chronological time
Evolution equations
What we learned
First-order discrete Markov processes
Variations of first-order Markov processes
A Markov process is a probabilistic model
The transition probability matrix
Properties of the transition probability matrix
Epidemic modeling with a first-order discrete Markov process
The transition probability matrix is a network
Using the transition matrix to generate state trajectories
Evolution of the state probability distribution
Stationary distributions and limiting distributions
First-order discrete Markov processes are memoryless
Likelihood of the state sequence
What we learned
Higher-order discrete Markov processes
Second-order discrete Markov processes
Evolution of the state probability distribution in higher-order models
A higher-order discrete Markov process is a first-order discrete Markov process in disguise
Higher-order discrete Markov processes are still memoryless
What we learned
Hidden Markov Models
Emission probabilities
Making inferences with an HMM
What we learned
Summary
Exercises
Notes and further reading
12
Kernel Methods
Technical requirements
The role of inner products in common learning algorithms
Sometimes we need new features in our inner products
What we learned
The kernel trick
What is a kernel?
Commonly used kernels
Kernel functions for other mathematical objects
Combining kernels
Positive semi-definite kernels
Mercer’s theorem and the kernel trick
Kernelized algorithms
What we learned
An example of a kernelized learning algorithm
kFDA code example
What we learned
Summary
Exercises
13
Information Theory
Technical requirements
What is information and why is it useful?
The concept of information
The mathematical definition of information
Information theory applies to continuous distributions as well
Why we measure information on a logarithmic scale
Why is quantifying information useful?
What we’ve learned
Entropy as expected information
Entropy
What we’ve learned
Mutual information
Conditional entropy
Mutual information for continuous variables
Mutual information as a measure of correlation
Mutual information code example
What we’ve learned
The Kullback-Leibler divergence
Relative entropy
KL-divergence for continuous variables
Using the KL-divergence for approximation
Variational inference
What we’ve learned
Summary
Exercises
Notes and further reading
14
Non-Parametric Bayesian Methods
Technical requirements
What are non-parametric Bayesian methods?
We still have parameters
The different types of non-parametric Bayesian methods
The pros and cons of non-parametric Bayesian methods
What we learned
Gaussian processes
The kernel function
Fitting GPR models
Prediction using GPR models
GPR code example
What we learned
Dirichlet processes
How do DPs differ from GPs?
The DP notation
Sampling a function from a DP
Generating a sample of data from a DP
Bayesian non-parametric inference using a DP
What we learned
Summary
Exercises
15
Random Matrices
Technical requirements
What is a random matrix?
What we learned
Using random matrices to represent interactions in large-scale systems
What we learned
Universal behavior of large random matrices
The Wigner semicircle law
What does RMT study?
Universal is universal
The classical Gaussian matrix ensembles
What we learned
Random matrices and high-dimensional covariance matrices
The Marčenko-Pastur distribution is a bulk distribution
Universality in the singular values of X
The Marčenko-Pastur distribution and neural networks
What we learned
Summary
Exercises
Notes and further reading
Index
Other Books You May Enjoy
Preface
This is not a book about a specific technology or programming language. This is a book about mathematics. And mathematics is a language. It is the language of science, and so it is the language of data science as well. We can say beautiful things with that language. Just as a piece of great literature is more than a large collection of individual letters, a mathematical equation is more than just a collection of symbols. An equation conveys a way of thinking about a data science problem. It conveys a concept or an idea. If you want to fully exploit the power of those ideas and adapt them to your own data science work, you need to move beyond just recognizing the symbols in an equation and move towards understanding what that equation is really telling you.
Many people are not confident in reading and interpreting mathematical equations and mathematical ideas. And yet, as with great literature, once someone guides us through the nuances and subtexts, their beauty is revealed and becomes obvious. That is what this book aims to do.
This book will not make you an expert in every area of mathematics. Instead, it will give you enough skills and confidence to read and navigate mathematical equations and ideas on your own. We do that by walking you through the core concepts that underpin many data science algorithms – the 15 math concepts of the book’s title. We also do that by walking through those concepts slowly and in detail. I am not a fan of mathematics books that consist solely of theorems, lemmas, and proofs. Instead, this book is unapologetically long-form math. When we introduce an equation, we will explain what the equation tells us, what its implications and ramifications are, and how it connects to other parts of math. We also illustrate those concepts with code examples in Python.
At the end of the book, you will be equipped to look at the math equations of any data science algorithm and confidently unpack what that algorithm is trying to do.
Who this book is for
This book is for data scientists and machine learning engineers who have been using data science and machine learning techniques, software, and Python packages such as scikit-learn, but without necessarily fully understanding the mathematics behind the algorithms. This could include the following types of people:
Data scientists who have a college/undergraduate degree in a numerate subject and so have a basic understanding of mathematics, but they want to learn more, particularly those bits of mathematics that will be helpful in their roles as data scientists.
Data scientists who have a good understanding of some of the mathematics behind bits of data science but want to discover some new math concepts that will be useful to them in their data science work.
Data scientists who have business or data science problems they need to solve, but existing software does not provide appropriate algorithms. They want to construct their own algorithms but lack the mathematical guidance on how to apply mathematics to the new data science problems.
What this book covers
Chapter 1, Recap of Mathematical Notation and Terminology, provides a summary of the main mathematical notation you will encounter in this book and that we expect you to already be familiar with.
Chapter 2, Random Variables and Probability Distributions, introduces the idea that all data contains some degree of randomness, and that random variables and their associated probability distributions are the natural way to describe that randomness. The chapter teaches you how to sample from a probability distribution, understand statistical estimators, and about the Central Limit Theorem.
Chapter 3, Matrices and Linear Algebra, introduces vectors and matrices as the basic mathematical structures we use to represent and transform data. It then shows how matrices can be broken down into simple-to-understand parts using techniques such as eigen-decomposition and singular value decomposition. The chapter finishes with explanations of how these decomposition methods are applied to principal component analysis (PCA) and non-negative matrix factorization (NMF).
Chapter 4, Loss Functions and Optimization, starts by introducing loss functions, risk functions, and empirical risk functions. The concept of minimizing an empirical risk function to estimate the parameters of a model is explained, before introducing Ordinary Least Squares estimation of linear models. Finally, gradient descent is illustrated as a general technique for minimizing risk functions.
Chapter 5, Probabilistic Modeling, introduces the concept of building predictive models that explicitly account for the random component within data. The chapter starts by introducing likelihood and maximum likelihood estimation, before introducing Bayes’ theorem and Bayesian inference. The chapter finishes with an illustration of Markov Chain Monte Carlo and importance sampling from the posterior distribution of a model’s parameters.
Chapter 6, Time Series and Forecasting, introduces time series data and the concept of auto-correlation as the main characteristic that distinguishes time series data from other types of data. It then describes the classical ARIMA approach to modeling time series data. Finally, it ends with a summary of concepts behind modern machine learning approaches to time series analysis.
Chapter 7, Hypothesis Testing, introduces what a hypothesis test is and why hypothesis tests are important in data science. The general form of a hypothesis test is outlined before the concepts of statistical significance and p-values are explained in depth. Next, confidence intervals and their interpretation are introduced. The chapter ends with an explanation of Type-I and Type-II errors, and power calculations.
Chapter 8, Model Complexity, introduces the concept of how we describe and quantify model complexity and discusses its impact on the predictive accuracy of a model. The classical bias-variance trade-off view of model complexity is introduced, along with the phenomenon of double descent. The chapter finishes with an explanation of model complexity measures for model selection.
Chapter 9, Function Decomposition, introduces the idea of decomposing or building up a function from a set of simpler basis functions. A general approach is explained first before the chapter moves on to introducing Fourier Series, Fourier Transforms, and the Discrete Fourier Transform.
Chapter 10, Network Analysis, introduces networks, network data, and the concept that a network is a graph. The node-edge description of a graph, along with its adjacency matrix representation, is explained. Next, the chapter describes different types of common graphs and their properties. Finally, the decomposition of a graph into sub-graphs or communities is explained, and various community detection algorithms are illustrated.
Chapter 11, Dynamical Systems, introduces what a dynamical system is and explains how its dynamics are controlled by an evolution equation. The chapter then focuses on discrete Markov processes as these are the most common dynamical systems used by data scientists. First-order discrete Markov processes are explained in depth, before higher-order Markov processes are introduced. The chapter finishes with an explanation of Hidden Markov Models and a discussion of how they can be used in commercial data science applications.
Chapter 12, Kernel Methods, starts by introducing inner-product-based learning algorithms, then moves on to explaining kernels and the kernel trick. The chapter ends with an illustration of a kernelized learning algorithm. Throughout the chapter, we emphasize how the kernel trick allows us to implicitly and efficiently construct new features and thereby uncover any non-linear structure present in a dataset.
Chapter 13, Information Theory, introduces the concept of information and how it is measured mathematically. The main information theory concepts of entropy, conditional entropy, mutual information, and relative entropy are then explained, before practical uses of the Kullback-Leibler divergence are illustrated.
Chapter 14, Non-Parametric Bayesian Methods, introduces the idea of using a Bayesian prior over functions when building probabilistic models. The idea is illustrated through Gaussian Processes and Gaussian Process Regression. The chapter then introduces Dirichlet Processes and how they can be used as priors for probability distributions.
Chapter 15, Random Matrices, introduces what a random matrix is and why random matrices are ubiquitous in science and data science. The universal properties of large random matrices are illustrated along with the classical Gaussian random matrix ensembles. The chapter finishes with a discussion of where large random matrices occur in statistical and machine learning models.
To get the most out of this book
To get the most out of this book, we assume you have at least some familiarity with high-school mathematics, such as complex numbers, basic calculus, and elementary uses of vectors and matrices. To get the most out of the code examples in the book, you should have some experience of coding in Python. You will also need access to a computer or server with a full Python installation and/or where you have privileges to run and install Python and any additional packages required.
The code examples given in each chapter, and the answers to the exercises at the end of each chapter, are available in the book’s GitHub repository as Jupyter notebooks. To run the notebooks, you will need a Jupyter installation.
If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.
Download the example code files
You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/15-Math-Concepts-Every-Data-Scientist-Should-Know. If there’s an update to the code, it will be updated in the GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
Conventions used
There are a number of text conventions used throughout this book.
Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: The following code example can be found in the Code_Examples_Chap5.ipynb notebook in the GitHub repository.
A block of code is set as follows:
map_estimate = minimize(neg_log_posterior,
x0,
method='BFGS',
options={'disp': True})
# Convert from logit(p) to p
p_optimal = np.exp(map_estimate['x'][0])/ (
1.0 + np.exp(map_estimate['x'][0]))
print(MAP estimate of success probability =
, p_optimal)
Bold: Indicates a new term, an important word, or words that you see onscreen. For instance, words in menus or dialog boxes appear in bold. Here is an example: The name ARIMA stands for Auto-Regressive Integrated Moving Average models.
Tips or important notes
Appear like this.
Get in touch
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, email us at [email protected] and mention the book title in the subject of your message.
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Share your thoughts
Once you’ve read 15 Math Concepts Every Data Scientist Should Know, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.
Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.
Download a free PDF copy of this book
Thanks for purchasing this book!
Do you like to read on the go but are unable to carry your print books everywhere?
Is your eBook purchase not compatible with the device of your choice?
Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.
Read anywhere, any place, on any device. Search, copy, and paste code from your favorite technical books directly into your application.
The perks don’t stop there. You can get exclusive access to discounts, newsletters, and great free content in your inbox daily.
Follow these simple steps to get the benefits:
1. Scan the QR code or visit the link below
https://packt.link/free-ebook/9781837634187
2. Submit your proof of purchase
3. That’s it! We’ll send your free PDF and other benefits to your email directly
Part 1: Essential Concepts
In this part, we will introduce the math concepts that you will encounter again and again as a data scientist, and that it is vital to understand well. After a recap of basic math notation, we look at the concepts related to how data is produced, then move on to concepts related to how data is transformed, finally building up to our end goal of how data is modeled. These concepts are essential because you will use and combine them constantly in your work. By the end of Part 1, you will be comfortable with the math concepts that underpin almost all data science models and algorithms.
This section contains the following chapters:
Chapter 1, Recap of Mathematical Notation and Terminology
Chapter 2, Random Variables and Probability Distributions
Chapter 3, Matrices and Linear Algebra
Chapter 4, Loss Functions and Optimization
Chapter 5, Probabilistic Modeling
1
Recap of Mathematical Notation and Terminology
Our tour of math concepts will start properly in Chapter 2. Before we begin that tour, we’ll start by recapping some mathematical notation and terminology. Mathematics is a language, and mathematical symbols and notation are its alphabet. Therefore, we must be comfortable with and understand the basics of this alphabet.
In this chapter, we will recap the most common core notation and terminology that we are likely to use repeatedly throughout the book. We have grouped the recap into six main math areas or topics. Those topics are as follows:
Number systems: In this section, we introduce notation for real and complex numbers
Linear algebra: In this section, we introduce notation for describing vectors and matrices
Sums, products, and logarithms: In this section, we introduce notation for succinctly representing sums and products, and we introduce rules for logarithms
Differential and integral calculus: In this section, we introduce basic notation for differentiation and integration
Analysis: In this section, we introduce notation for describing limits, and order notation
Combinatorics: In this section, we introduce notation for binomial coefficients
Some of this notation you may already be familiar with. For example, complex numbers, matrices, logarithms, and basic differential calculus you will have seen either in high school or in the first year of an undergraduate degree in a numerate subject. Other topics, such as order notation, you may have encountered as part of a university degree course on mathematical analysis or algorithm complexity, or they may be new to you. For the most part, though, you will have seen the notation recapped in this chapter before. If you are already familiar and comfortable with the symbols and notation recapped here, you can skip this chapter and easily come back later to just those sections that contain notation that is new to you.
We should emphasize that this chapter is a recap. It is brief. It is not meant to be an exhaustive and comprehensive review. We focus on presenting a few main facts, but also on trying to give a feel for why the notation may be useful and how it is likely to be used.
Finally, we will encounter new notation, terminology, and symbols as we progress through the book when we are discussing specific topics. We will introduce this new notation and terminology as and when we need it.
Technical requirements
As this chapter solely recaps some of the mathematical notation we will use in later chapters, there are no code examples given and hence no technical requirements for this particular chapter.
For later chapters, you will be able to find code examples at the GitHub repository: https://github.com/PacktPublishing/15-Math-Concepts-Every-Data-Scientist-Should-Know
Number systems
In this section, we introduce notation for describing sets of numbers. We will focus on the real numbers and the complex numbers.
Notation for numbers and fields
As this is a book about data science, we will be dealing with numbers. So, it will be worthwhile recapping the notation we use to refer to the most common sets of numbers.
Most of the numbers we will deal with in this book will be real numbers, such as 4.6, 1, or -2.3. We can think of them as "living" on the real number line shown in Figure 1.1. The real number line is a one-dimensional continuous structure. There are an infinite number of real numbers. We denote the set of all real numbers by the symbol ℝ.
Figure 1.1: The real number line
Obviously, there will be situations where we want to restrict our datasets to, say, just integer-valued numbers. This would be the case if we were analyzing count data, such as the number of items of a particular product on an e-commerce site sold on a particular day. The integer numbers, …, -2, -1, 0, 1, 2, …, are a subset of the real numbers, and we denote them by the symbol ℤ. Despite them being a subset of the real numbers, there are still an infinite number of integers.
For the e-commerce count data that we mentioned earlier, the integer value would always be positive. If we restrict ourselves to strictly positive integers, 1, 2, 3, …, and so on, then we have the natural or counting numbers. These we denote by the symbol ℕ.
As well as real numbers, we will occasionally deal with complex numbers. As the name suggests, complex numbers have more structure to them than real numbers. The complex numbers don’t live on the real number line and so are not a subset of the real numbers, but instead, they have a two-dimensional structure, which we’ll explain in a moment. We denote the set of complex numbers by the symbol ℂ.
Sometimes, there are very specific occasions when we may want to refer to other subsets of the real numbers. Other common symbols you may encounter are ℚ for the rational numbers and ℝ⁺ for the positive real numbers, along with set-builder notation such as {x ∈ ℝ | x ≥ 0} for the set of non-negative real numbers.
Numbers such as 4.6 are specific instances of a real number. When we are talking about algorithms or code, we will want to talk about variables, in which case we use a symbol such as x. To say that the variable x is a real number, we use the membership symbol ∈ and write x ∈ ℝ. We read this as "x is a member of the set of real numbers," or more succinctly, "x is real."
Likewise, if we wanted to say that a variable z is a complex number, we would write z ∈ ℂ, while n ∈ ℤ says that the variable n is an integer.
When we have several variables that all have similar properties or that may be related in some way – for example, they represent different features of a data point in a training set – then we use subscripts to denote the different variables. For example, we would use x_1, x_2, …, x_d to denote the d feature values of a data point.
Complex numbers
If the real numbers live on the one-dimensional structure that is the real number line, this raises the question of whether we can have numbers that live in a two-dimensional space. Complex numbers are such numbers. A complex number, z, is built from two real numbers, x and y, and is written in the form

$$z = x + iy$$

Eq. 1
The symbol i denotes the imaginary unit, which is defined by the property i² = −1. The number x is called the real part of z, and the number y is called the imaginary part of z. Since z is specified by the pair of real numbers (x, y), we can represent it as a point in a two-dimensional plane, called the complex plane, shown in Figure 1.2.
Figure 1.2: The complex number plane
The position of z in the complex plane is given by its Cartesian coordinates (x, y). We use the notation Re(z) for the real part of z and Im(z) for the imaginary part of z, so that

$$\mathrm{Re}(z) = x, \qquad \mathrm{Im}(z) = y$$

Eq. 2

Consequently, we have used Re(z) and Im(z) to label the horizontal and vertical axes of the complex plane in Figure 1.2.
A number that has Im(z) = 0 lies on the horizontal axis and is just a real number, while a number that has Re(z) = 0 lies on the vertical axis and is called purely imaginary.
Just as with other 2D planes, we can represent a point in the complex plane not just with Cartesian coordinates (x, y), but also with polar coordinates (r, θ), where r is the distance of the point from the origin and θ is the angle measured from the positive real axis. In these polar coordinates, we can write

$$z = r\left(\cos\theta + i\sin\theta\right)$$

Eq. 3
The symbol |z| denotes the modulus of the complex number z. It is the distance r of z from the origin of the complex plane, and so, by Pythagoras’ theorem, it can be calculated from the real and imaginary parts of z as

$$|z| = \sqrt{x^2 + y^2}$$

Eq. 4
The angle θ is called the argument of z. It satisfies

$$\tan\theta = \frac{y}{x}$$

Eq. 5
This means we can also write a complex number in exponential form, using Euler’s formula e^{iθ} = cos θ + i sin θ, as

$$z = re^{i\theta}$$

Eq. 6
This last form for writing a complex number will be useful when we introduce Fourier transforms, which are used to represent functions as a sum of sine and cosine waves. In fact, this is our main reason for introducing complex numbers.
One important concept relating to the complex number z is that of its complex conjugate. The complex conjugate of z, which we denote z*, is obtained by flipping the sign of the imaginary part of z. So, if z = x + iy, then its complex conjugate is

$$z^* = x - iy$$

Eq. 7

Geometrically, taking the complex conjugate corresponds to reflecting z in the real axis of the complex plane, as shown in Figure 1.3. One useful property is that multiplying z by its own conjugate gives the squared modulus of z, since zz* = (x + iy)(x − iy) = x² + y² = |z|².
Figure 1.3: The complex conjugate
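As a quick numerical check of these definitions, we can use Python’s built-in complex type together with the standard cmath module. This is a minimal illustrative sketch, not code from the book’s repository, using z = 3 + 4i:

import cmath

z = 3.0 + 4.0j               # z = x + iy, with x = 3 and y = 4
print(z.real, z.imag)        # Re(z) = 3.0 and Im(z) = 4.0
print(abs(z))                # modulus |z| = sqrt(3**2 + 4**2) = 5.0
print(cmath.phase(z))        # argument theta, in radians
print(z.conjugate())         # complex conjugate z* = (3-4j)
print(z * z.conjugate())     # z times z* = |z|**2 = (25+0j)
r, theta = cmath.polar(z)    # polar form z = r*e^(i*theta)
print(cmath.rect(r, theta))  # rebuilding z from r and theta recovers (3+4j)

Note that cmath.rect(r, theta) returns r cos θ + i r sin θ, which is exactly the polar form of Eq. 3.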
The integers, real numbers, and complex numbers represent the overwhelming majority of the numbers we will meet throughout this book, so this is a good place to end our recap of number systems.
Let’s summarize what we learned.
What we learned
In this section, we have learned the following:
The notation ℝ for the set of real numbers

The notation ℤ for the set of integers

The notations ℕ and ℂ for the natural numbers and the complex numbers

The notation x ∈ ℝ for saying that the variable x is real, and set-builder notation such as {x ∈ ℝ | x ≥ 0}

The notation x_1, x_2, …, x_d for denoting several related variables using subscripts
How complex numbers have a real and an imaginary part
How a complex number z = x + iy can be represented as a point in the two-dimensional complex plane, and written in polar form as z = re^{iθ}
How to calculate the complex conjugate z* = x − iy of a complex number z = x + iy
In the next section, having learned how to describe both real and complex numbers, we move on to how to describe collections of numbers (vectors) and how to describe mathematical objects (matrices) that transform those vectors.
Linear algebra
In this section, we introduce notation to describe vectors and matrices, which are key mathematical objects that we will encounter again and again throughout this book.
Vectors
In many circumstances, we will want to represent a set of numbers together. For example, the numbers 7.3 and 1.2 might represent the values of two features that correspond to a data point in a training set. We often group these numbers together in brackets and write them as (7.3, 1.2) or [7.3, 1.2]. Because of the similarity to the way we write spatial coordinates, we tend to call a collection of numbers that are held together a vector. A vector can be two-dimensional, as in the example just given, or d-dimensional, meaning it contains d components, and so might look like (x_1, x_2, …, x_d).
We can write a vector in two ways. We can write it as a row vector, going across the page, such as the following vector:
$$\left(x_1, x_2, \ldots, x_d\right) = \text{a } d\text{-dimensional row vector}$$

Eq. 8
Alternatively, we can write it as a column vector going down the page, such as the following vector:
$$\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_d \end{pmatrix} = \text{a } d\text{-dimensional column vector}$$

Eq. 9
We can convert between a row vector and a column vector (and vice versa) using the transpose operator, denoted by a superscript ⊤ symbol. For example:

$$\left(x_1, x_2, \ldots, x_d\right)^\top = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_d \end{pmatrix}$$

Eq. 10
And vice-versa in the following example:
$$\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_d \end{pmatrix}^\top = \left(x_1, x_2, \ldots, x_d\right)$$

Eq. 11
Symbolically, we often write a vector using a boldface font – for example, 𝐱 = (x_1, x_2, …, x_d) – so that we can refer to the whole vector compactly by the single symbol 𝐱.
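For readers who like to see notation mirrored in code, NumPy represents row and column vectors as two-dimensional arrays of shape 1 x d and d x 1, with the .T attribute playing the role of the transpose operator. The following is a small illustrative sketch rather than an example from the book’s repository:

import numpy as np

x_row = np.array([[7.3, 1.2]])   # a 1 x 2 row vector
x_col = x_row.T                  # the transpose is a 2 x 1 column vector
print(x_row.shape, x_col.shape)  # (1, 2) (2, 1)
print(x_col.T)                   # transposing twice recovers the row vector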
Matrices
Usually, we will want to transform a vector more than just transposing it. Linear transformations of vectors can be done with matrices. We will cover such transformations in Chapter 3, but for now, we will just show how we write a matrix. A matrix is a two-dimensional array. For example, the following array is a matrix:
$$\underline{\underline{M}} = \begin{pmatrix} 7 & 3 & 2 & 5 \\ 1 & -2 & -1 & 6 \\ 1 & -9 & 14 & 0 \end{pmatrix}$$

Eq. 12
We have used a double underline to denote the matrix M, which distinguishes a matrix from a vector or a scalar.
Because a matrix is a two-dimensional structure, we use two numbers to describe its size: the number of rows and the number of columns. If a matrix has R rows and C columns, we describe it as an R x C matrix. The matrix M in Eq. 12 has 3 rows and 4 columns, and so it is a 3 x 4 matrix.
We pick out individual parts of a matrix by referring to a matrix element. The symbol M_ij denotes the matrix element that sits in the i-th row and j-th column of the matrix M. For example, for the matrix in Eq. 12, the element M_23, in the second row and third column, is −1.
The matrix elements in the previous example are all integers. This need not be the case. A matrix element could be any real number. It can also be a complex number. If all the matrix elements are real, we say it is a real matrix, while if any of the matrix elements are complex, then we say the matrix is complex.
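As an illustrative sketch (assuming NumPy, which the book’s later code examples use), we can store the matrix of Eq. 12 as a two-dimensional array and read off its size and individual elements. Remember that NumPy indices start at 0, so the mathematical element M_23 corresponds to M[1, 2]:

import numpy as np

M = np.array([[7,  3,  2, 5],
              [1, -2, -1, 6],
              [1, -9, 14, 0]])  # the 3 x 4 matrix of Eq. 12
print(M.shape)                  # (3, 4): R = 3 rows, C = 4 columns
print(M[1, 2])                  # the element in row 2, column 3, which is -1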
That short recap on notation for vectors and matrices is enough for now. We will meet vectors and matrices again in Chapter 3, but for now, let’s summarize what we have learned about them.
What we learned
In this section, we have learned about the following:
How to represent a vector as a collection of multiple components (numbers)
Row vectors and column vectors and how they are related to each other via the transpose operator
How a matrix is a two-dimensional collection of components (numbers) and how the notation M_ij refers to the matrix element in the i-th row and j-th column of the matrix M
In the next section, now that we have learned about various notations for individual numbers and collections of them, we move on to notation for performing operations on them. We start with the simplest operations – adding numbers together, multiplying numbers together, and taking logarithms.
Sums, products, and logarithms
In this section, we introduce notation for doing the most basic operations we can do with numbers, namely adding them together or multiplying them together. We’ll then introduce notation for working with logarithms.
Sums and the 𝚺 notation
When we want to add several numbers together, we can use the summation, or Σ, notation. For example, suppose we have five numbers, x_1, x_2, x_3, x_4, x_5, and we want to add them all together. We can write this succinctly as

$$\sum_{i=1}^{5} x_i$$

Eq. 13
This notation is shorthand for writing
x_1 + x_2 + x_3 + x_4 + x_5. This essentially defines what the Σ notation means:

$$\sum_{i=1}^{5} x_i = x_1 + x_2 + x_3 + x_4 + x_5$$

Eq. 14
In the left-hand side (LHS) of Eq. 14, the integer indexing variable, i, starts at the lower limit of 1, written underneath the Σ symbol, and runs up to the upper limit of 5, written above the Σ symbol. For each value of i in that range, we take the corresponding number x_i and add it to the running total.
You may wonder whether the shorthand notation on the LHS of Eq. 14 is of any use. After all, the right-hand side (RHS) isn’t very long. However, when we want to represent the adding up of lots of numbers – say, a million numbers, x_1 up to x_1000000 – then the Σ notation really comes into its own. Rather than writing out a million terms, we simply write

$$\sum_{i=1}^{1000000} x_i$$

Eq. 15
Sometimes, we will use the Σ notation with a general, unspecified number of terms, N:

$$\sum_{i=1}^{i=N} x_i$$

Eq. 16
This means "add together the N numbers, http://schemas.openxmlformats.org/officeDocument/2006/math
>http://schemas.openxmlformats.org/officeDocument/2006/math
>http://schemas.openxmlformats.org/officeDocument/2006/math
>
Sometimes, you may see variants of the expression in the previous equation. Sometimes, a person may omit the upper value of the summation index, or even drop the summation limits entirely, and just write

$$\sum_{i} x_i$$

Eq. 17
This usually means "add up all values of $x_i$ over the whole range of possible values of the indexing variable $i$", with that range being understood from the context.
Note also that when writing sums using the Σ notation, the indexing variable is a dummy variable; which letter we choose for it makes no difference to the value of the sum. So, for example, we have the following:

$$\sum_{i=1}^{N} x_i = \sum_{j=1}^{N} x_j$$
Eq. 18
The LHS of Eq. 18 is the sum written using the indexing variable $i$, while the RHS uses $j$. Both are shorthand for exactly the same expression, namely the following:

$$x_1 + x_2 + \cdots + x_N$$
Eq. 19
Finally, it is worth pointing out that we can also use the Σ notation when the quantity being summed is a function of the indexing variable itself. For example, the sum of the squares of the first 100 positive integers is written as follows:

$$\sum_{i=1}^{100} i^2$$
Eq. 20
This is obviously shorthand notation for $1^2 + 2^2 + 3^2 + \cdots + 99^2 + 100^2$.
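In Python, this kind of sum over a function of the index is a one-liner. Here is a minimal sketch of Eq. 20:

# A minimal sketch of Eq. 20: the sum of the squares of 1 to 100.
total = sum(i**2 for i in range(1, 101))
print(total)  # 338350, matching the closed form n(n+1)(2n+1)/6 with n = 100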
Products and the Π notation
Having introduced the Σ notation for sums, the corresponding shorthand notation for multiplying lots of numbers together uses the capital Greek letter Π. For example, the product of the $N$ numbers $x_1, x_2, \ldots, x_N$ is written as follows:

$$\prod_{i=1}^{N} x_i$$
Eq. 21
As with the Σ notation, the indexing variable $i$ starts at the value shown beneath the Π symbol and runs up to, and including, the value shown above it. The expression in Eq. 21 is therefore shorthand for the following:

$$\prod_{i=1}^{N} x_i = x_1 \times x_2 \times \cdots \times x_N$$
Eq. 22
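The Π notation translates into code just as naturally as the Σ notation. Here is a minimal Python sketch of Eq. 22, with four hypothetical values:

# A minimal sketch of Eq. 22, with hypothetical values for x_1, ..., x_4.
import math

x = [1.5, 2.0, 0.5, 4.0]
product = math.prod(x)                # the Pi notation: multiply all the x_i
explicit = x[0] * x[1] * x[2] * x[3]  # every factor written out
assert product == explicit            # both equal 6.0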
Logarithms
Logarithms are extremely useful for describing how quickly a quantity or function grows. In particular, the logarithm tells us the exponent that describes the rate of growth of a quantity or function. Let’s make that more explicit. The logarithm, to base $a$, of a number $y$ is the exponent $x$ to which we must raise $a$ in order to get $y$. That is, if $y = a^x$, then we have the following:

$$\log_a(y) = x$$
Eq. 23
The symbol $\log_a$ denotes the logarithm to base $a$. Any positive number other than 1 can serve as the base, but the most common choices are $a = 10$, $a = 2$, and $a = e$, where $e \approx 2.71828$ is Euler's number. The logarithm to base $e$ is called the natural logarithm and is usually written as $\ln(y)$ rather than $\log_e(y)$.
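As a quick numerical illustration, here is a minimal Python sketch of Eq. 23, using base $a = 2$ and a hypothetical exponent:

# A minimal sketch of Eq. 23: the logarithm recovers the exponent.
import math

a, x = 2.0, 10.0
y = a ** x               # y = a raised to the power x, i.e. 1024.0
print(math.log(y, a))    # 10.0, up to floating-point rounding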
We can see from Eq. 23 that the logarithm does in fact tell us the exponent (in base $a$) to which $a$ must be raised to give the value $y$. Another important property of the logarithm is that it is a monotonic function. The word monotonic means "of one tone" or "of one direction", and so it means either only going up (monotonically increasing) or only going down (monotonically decreasing). This is shown in Figure 1.4, which shows the natural logarithm function $\ln(x)$; it is monotonically increasing for all $x > 0$.
Figure 1.4: Graph of the natural logarithm function
An important consequence of the monotonically increasing nature of the logarithm function is that if we have a function $f(x)$ that attains its maximum value at the point $x = \hat{x}$, then $\ln f(x)$ also attains its maximum at $x = \hat{x}$. Taking the logarithm does not move the location of the maximum, so we can write the following:

$$\underset{x}{\operatorname{argmax}} f(x) = \underset{x}{\operatorname{argmax}} \ln f(x)$$
Eq. 24
We will refer to this again in a moment.
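We can check this behaviour numerically. Here is a minimal sketch, assuming NumPy is available, using a positive function whose peak is at a known location:

# A minimal sketch of Eq. 24: taking the log does not move the maximum.
import numpy as np

x = np.linspace(-3.0, 3.0, 601)
f = np.exp(-(x - 1.0) ** 2)      # a positive function with its peak at x = 1

i_f = np.argmax(f)               # index of the maximum of f(x)
i_log = np.argmax(np.log(f))     # index of the maximum of ln(f(x))
assert i_f == i_log
print(x[i_f])                    # 1.0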
There are well-known rules for taking logarithms of reciprocals, products, and ratios. These are (for any base):
$$\log_a\left(\frac{1}{y}\right) = -\log_a(y)$$

Eq. 25
And the following:
$$\log_a(xy) = \log_a(x) + \log_a(y)$$

Eq. 26
Combining these two rules, we get the rule for taking the log of a ratio:
$$\log_a\left(\frac{x}{y}\right) = \log_a(x) + \log_a\left(\frac{1}{y}\right) = \log_a(x) - \log_a(y)$$

Eq. 27
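These rules are easy to verify numerically. Here is a minimal Python sketch, for one hypothetical choice of $x$ and $y$:

# A minimal sketch: checking Eqs. 25-27 numerically for one choice of x and y.
import math

x, y = 12.0, 3.0
assert math.isclose(math.log(1 / y), -math.log(y))               # Eq. 25
assert math.isclose(math.log(x * y), math.log(x) + math.log(y))  # Eq. 26
assert math.isclose(math.log(x / y), math.log(x) - math.log(y))  # Eq. 27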
The rule for taking the log of a product is particularly useful when we have a product formed from many numbers. Using the Π notation, we can write it compactly as follows:

$$\log_a\left(\prod_{i=1}^{N} x_i\right) = \sum_{i=1}^{N} \log_a(x_i)$$
Eq. 28
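Eq. 28 also matters for purely numerical reasons: a product of many small numbers can underflow to zero in floating-point arithmetic, whereas the corresponding sum of logs remains well behaved. Here is a minimal Python sketch, with hypothetical values:

# A minimal sketch of Eq. 28: sum the logs instead of multiplying the numbers.
import math

probs = [1e-5] * 100      # 100 small numbers; their true product is 1e-500
log_prod = sum(math.log(p) for p in probs)
print(log_prod)           # approximately -1151.29, which is ln(1e-500)
print(math.prod(probs))   # 0.0, because the direct product underflows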
This, in conjunction with the fact that taking the log is a monotonic transformation, will be very useful to us when we start to use the concept of maximum likelihood to build probabilistic models in Chapter 5.
We will make lots of use of sums, products, and logarithms throughout this book, but we have all the notation we need to work with them, so let’s summarize what we have learned about that notation.
What we learned
In this section, we have learned about the following:
The Σ notation for adding lots of numbers together
The Π notation for multiplying lots of numbers together
How we can also use the Σ and Π notations when the quantity being summed or multiplied is a function of the indexing variable