Introduction To Data Science in Finance

Uploaded by

Rishi Sant

Data Science and Analytics for Finance Course

Introduction to Data Science in Finance
What is Data Science?
[Diagram: Data Science, drawing on scientific methods, processes, algorithms, and systems]
Key components of Data Science

Data Collection

Data Preparation

Data Analysis

Data Visualization

Machine Learning
Data Science: A Multifaceted Field
[Diagram: Data Science combining Computer Science, Statistics, and Domain Expertise]
Statistics

• Data analysis
• Probability theory
• Hypothesis testing
Computer Science

• Programming
• Algorithms
• Database management
Domain Expertise

• Contextual understanding
• Problem framing
• Collaboration
The Interplay

• A statistician might develop a new method for analyzing time series data, but a computer scientist would implement it in efficient software.
• A domain expert in healthcare might identify a need to predict patient outcomes, while a data scientist would use machine learning to build a predictive model.
Tools and Technologies

• Programming languages: Python, R, SQL
• Libraries and frameworks: NumPy, pandas, scikit-learn, TensorFlow
• Data visualization tools: Matplotlib, Seaborn, Tableau
• Cloud platforms: AWS, Azure, GCP
Data Science Revolutionizing Financial Services
Financial Statement Analysis
• Automated data extraction: Data
science algorithms can extract
data from various sources,
including financial statements,
contracts, and invoices, reducing
manual effort and errors.
• Pattern recognition: By analyzing
historical financial data, data
scientists can identify trends,
anomalies, and potential risks that
may not be apparent to human
auditors.
Fraud Detection
• Real-time anomaly detection:
Data science algorithms can
analyze vast amounts of
transaction data in real-time,
identifying suspicious patterns
that may indicate fraudulent
activity.
• Machine learning models:
Advanced models can learn from
historical fraud data to predict
future fraudulent attempts,
improving detection rates and
reducing losses.


Tax Compliance
• Automated tax return
preparation: Data science can
automate the process of
preparing and filing tax
returns, reducing errors and
improving efficiency.
• Tax planning optimization: By
analyzing historical tax data
and current tax laws, data
scientists can help businesses
identify tax-saving
opportunities.
Credit Scoring
• Enhanced risk assessment: Data
science enables lenders to create
more accurate and comprehensive
credit scores by incorporating a
wider range of data points beyond
traditional credit history.
• Predictive analytics: By analyzing
customer behavior and financial
patterns, data scientists can
predict the likelihood of default,
helping lenders make more
informed decisions.


Financial Forecasting

• Improved accuracy: Data science models can provide more accurate financial forecasts by incorporating a wider range of data points and using advanced statistical techniques.
• Scenario analysis: Data scientists can help businesses evaluate the potential impact of different economic scenarios on their financial performance.
Key Roles in Finance Using Data Science

• Quantitative Analyst
• Data Engineer
• Financial Data Scientist
• Risk Analyst
• Algorithmic Trader
• Financial Modeler
• Regulatory Data Analyst

Data Science Process in Finance

• Step 1 - Data Collection

• Gathering relevant data: This
step involves collecting data
from various sources, such as
financial statements, market
data, customer information, and
economic indicators.
• Data quality assessment:
Ensuring the data is accurate,
complete, and consistent is
crucial for the subsequent steps.
• Step 2 - Data Preparation
• Cleaning and preprocessing:
Removing outliers, handling missing
values, and transforming data into a
suitable format for analysis.
• Feature engineering: Creating new
features or transforming existing ones
to improve model performance.
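
The preparation step above can be sketched in a few lines of plain Python. This is only an illustration: the field names (`income`, `debt`), the sample rows, and the derived `debt_to_income` feature are assumptions, not part of the course material.

```python
# Hypothetical raw rows as they might arrive from a spreadsheet export:
# numbers stored as text, with thousands separators and stray spaces.
raw_rows = [
    {"income": "52,000", "debt": "13,000"},
    {"income": " 48000 ", "debt": "24,000"},
]

def to_number(text):
    """Cleaning: strip whitespace and thousands separators, then parse."""
    return float(text.replace(",", "").strip())

def prepare(row):
    """Clean one row and add an engineered feature (debt-to-income ratio)."""
    income = to_number(row["income"])
    debt = to_number(row["debt"])
    return {"income": income, "debt": debt, "debt_to_income": debt / income}

prepared = [prepare(row) for row in raw_rows]
for row in prepared:
    print(row)
```

The ratio is one example of a derived feature; which features help depends on the model and the problem.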
• Step 3 - Data Exploration
• Exploratory data analysis (EDA): Using
statistical techniques and
visualization tools to understand the
data's characteristics, identify
patterns, and uncover potential
relationships.
• Data visualization: Creating charts,
graphs, and other visualizations to
help stakeholders understand the
data more easily.
• Step 4 – Modeling
• Selecting appropriate models: Choosing the most suitable models
based on the problem at hand and the data's characteristics.
• Model training and evaluation: Training the models on the prepared
data and evaluating their performance using appropriate metrics.
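
As a rough sketch of training and evaluation, the example below fits a straight line by ordinary least squares on a training split and measures mean absolute error on held-out data. The data, the choice of a linear model, and the choice of metric are all assumptions made for illustration; in practice a library such as scikit-learn would usually be used.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical data: one explanatory variable and one target.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.1, 12.9]

# Train on the first four observations, evaluate on the last two.
train_x, test_x = xs[:4], xs[4:]
train_y, test_y = ys[:4], ys[4:]

intercept, slope = fit_line(train_x, train_y)
predictions = [intercept + slope * x for x in test_x]
mae = mean_absolute_error(test_y, predictions)
print(f"slope={slope:.2f} intercept={intercept:.2f} MAE={mae:.2f}")
```

Evaluating on data the model has not seen is the point of the split; scoring on the training data alone would overstate performance.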
• Step 5 – Interpretation and
Deployment
• Interpreting model results:
Understanding the insights provided
by the models and explaining their
implications.
• Deploying models: Integrating the
models into production systems for
real-time or batch processing.
Data Types in
Finance: Structured
• Data organized in a predefined format, such as
rows and columns in a spreadsheet or
database table.
• Examples - Financial statements (income
statements, balance sheets, cash flow
statements), Market data (stock prices, interest
rates, exchange rates) and Customer
information (name, address, contact details,
transaction history).
• Advantages - Easy to store, retrieve, and
analyze using traditional database
management systems.
• Challenges - May not capture the full
complexity of financial phenomena, especially
when dealing with unstructured information.
Data Types in Finance:
Unstructured

• Data that does not have a predefined structure

and is difficult to store in a traditional
database.
• Examples - Textual data (news articles, social
media posts, research papers), Audio data
(customer calls, market commentary) and
Video data (security footage, market analysis
presentations)
• Advantages - Rich in information, can provide
valuable insights when analyzed effectively.
• Challenges - Difficult to process and analyze
due to its unstructured nature, requiring
advanced techniques like natural language
processing and machine learning.
Data Types in Finance:
Semi-Structured

• Data that has some underlying structure but is

not strictly organized in a predefined format.
• Examples - XML and JSON files (used to store
structured data in a hierarchical format) and
Email messages (contain structured elements
like headers and body text, but also
unstructured content)
• Advantages - Offers a balance between
structured and unstructured data, allowing for
easier analysis while preserving rich
information.
• Challenges - May require specialized tools and
techniques for efficient processing and
analysis.
Internal Financial
Data Sources

• Accounting records
• Financial statements
• Management reports
External
Financial Data
Sources
• Financial databases
• Government agencies
• Industry associations
• Research firms
Importance of Clean
Data in Finance

• Accurate Financial Reporting –

• Regulatory compliance: Clean data is
essential for complying with financial
regulations and standards, such as
Generally Accepted Accounting
Principles (GAAP) or International
Financial Reporting Standards (IFRS).
• Investor confidence: Accurate financial
reporting builds trust with investors
and stakeholders, leading to increased
market valuation and access to capital.
Effective Risk
Management

• Identifying risks: Clean data helps

identify and assess potential financial
risks, such as credit risk, market risk,
and operational risk.
• Mitigating risks: Accurate data enables
the development of effective risk
management strategies to minimize
losses and protect financial stability.
Informed Decision Making

• Data-driven insights: Clean data

provides the foundation for data-driven
decision-making, enabling financial
professionals to make informed
choices about investments, risk
management, and strategic planning.
• Improved efficiency: Clean data can
streamline processes, reduce manual
errors, and improve overall operational
efficiency.
Enhanced Regulatory
Compliance

• Meeting requirements: Clean data

helps financial institutions meet
regulatory requirements, such as those
related to anti-money laundering (AML)
and know-your-customer (KYC)
regulations.
• Avoiding penalties: Inaccurate or
incomplete data can lead to regulatory
fines and penalties, which can have a
significant impact on a financial
institution's reputation and bottom
line.
Improved Customer
Experience

• Personalized services: Clean customer

data enables financial institutions to
offer personalized products and
services, enhancing customer
satisfaction and loyalty.
• Efficient operations: Accurate customer
data can streamline operations, reduce
errors, and improve overall customer
experience.
Common Data Quality Issues:
Missing Data
Incomplete transaction records

Data corruption

Privacy concerns

Data integration
Techniques for Handling
Missing Data

1. Removal
• Listwise deletion: Remove all
observations with missing values. This
method can be effective if the number
of missing values is small and the
values are missing completely at random.
• Pairwise deletion: Exclude observations
with missing values only for the
specific analysis or calculation. This
method can be more efficient than
listwise deletion but may introduce bias
if the missing values are not random.
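
A minimal sketch of the two removal strategies above, in plain Python. The record layout and field names (`revenue`, `costs`) are hypothetical, and `None` is assumed to mark a missing value.

```python
# Hypothetical records; None marks a missing value.
records = [
    {"revenue": 120.0, "costs": 80.0},
    {"revenue": None, "costs": 95.0},
    {"revenue": 150.0, "costs": None},
    {"revenue": 130.0, "costs": 90.0},
]

def listwise_delete(rows):
    """Listwise deletion: keep only rows with no missing value in any field."""
    return [r for r in rows if all(v is not None for v in r.values())]

def pairwise_values(rows, field):
    """Pairwise deletion: for an analysis of one field, keep every row
    where that field is observed, even if other fields are missing."""
    return [r[field] for r in rows if r[field] is not None]

complete_rows = listwise_delete(records)
costs_available = pairwise_values(records, "costs")
print(len(complete_rows), "complete rows;", len(costs_available), "cost values")
```

Here listwise deletion keeps two rows, while a costs-only analysis under pairwise deletion can still use three.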
2. Imputation

• Mean/median/mode imputation:
Replace missing values with the mean,
median, or mode of the respective
variable. This method is simple but
understates the variable's variance and
can distort relationships between
variables; for skewed data the median
is usually a safer choice than the mean.
• Hot deck imputation: Replace missing
values with values from a randomly
selected donor observation with similar
characteristics.
• Cold deck imputation: Replace
missing values with values from a
predetermined donor observation.
• Regression imputation: Use
regression analysis to predict missing
values based on other variables in
the dataset.
• Multiple imputation: Create several
complete datasets by imputing the
missing values multiple times with
random variation, analyze each
dataset, and pool the results so that
the uncertainty of the imputation is
reflected.
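
The simplest technique in this list, mean or median imputation, might look like the following sketch. The `balances` data is made up, and `None` is assumed to mark a missing value; the other techniques (hot deck, regression, multiple imputation) need more machinery and are not shown.

```python
from statistics import mean, median

def impute(values, strategy="median"):
    """Fill None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed) if strategy == "mean" else median(observed)
    return [fill if v is None else v for v in values]

# Hypothetical account balances with one gap and one very large value.
balances = [1200.0, 1500.0, None, 1300.0, 9000.0]

mean_filled = impute(balances, "mean")
median_filled = impute(balances, "median")
print("mean fill:", mean_filled[2], "median fill:", median_filled[2])
```

The single large balance pulls the mean fill far above the typical account, which is why the median is often preferred for skewed financial data.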
Common Data Quality
Issues: Outliers in
Financial Data

• Outliers are data points that

significantly deviate from the
majority of the data. In finance,
they can represent unusual
events, errors, or fraudulent
activities. Identifying and
handling outliers is crucial for
accurate analysis and reliable
results.
Identifying
Outliers

• z-scores - Calculate the z-score

for each data point by
subtracting the mean from the
value and dividing by the
standard deviation.
• Outliers are typically defined as
data points whose absolute z-score
exceeds a chosen threshold (e.g., 3
or 4).
• Interquartile Range (IQR) - Calculate the IQR by
subtracting the first quartile (Q1) from the third quartile
(Q3).
• Flag as outliers any values below the lower fence or above the upper fence:
• Lower fence: Q1 - 1.5 * IQR
• Upper fence: Q3 + 1.5 * IQR
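
Both detection rules can be sketched with Python's `statistics` module. The sample returns are invented, the z-score uses the sample standard deviation, and the quartiles come from `statistics.quantiles` with its default method; other quartile conventions give slightly different fences.

```python
from statistics import mean, quantiles, stdev

def zscore_outliers(values, threshold=3.0):
    """Values whose absolute z-score exceeds the threshold."""
    mu, sigma = mean(values), stdev(values)
    return [v for v in values if abs((v - mu) / sigma) > threshold]

def iqr_fences(values):
    """Lower and upper fences: Q1 - 1.5 * IQR and Q3 + 1.5 * IQR."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def iqr_outliers(values):
    """Values that fall outside the IQR fences."""
    low, high = iqr_fences(values)
    return [v for v in values if v < low or v > high]

# Hypothetical daily returns with one extreme observation.
daily_returns = [0.01, -0.02, 0.015, 0.005, -0.01, 0.02, -0.005, 0.0, 0.01, 0.35]

print("z-score (threshold 3):", zscore_outliers(daily_returns))
print("z-score (threshold 2.5):", zscore_outliers(daily_returns, 2.5))
print("IQR rule:", iqr_outliers(daily_returns))
```

In this small sample the extreme return inflates the standard deviation, so its z-score stays below 3 and only the lower threshold or the IQR rule flags it. This is one reason the IQR rule is often considered more robust.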
Treatment of Outliers
• Winsorizing: Replace outliers with the nearest non-
outlier value.
• This method preserves the overall distribution but can
introduce bias if there are many outliers.
• Capping: Replace outliers with a predetermined
maximum or minimum value.
• This method can be useful when outliers represent
extreme values that are unlikely to be accurate.
• Removing extreme values: Remove outliers from the
dataset entirely.
• This method can be appropriate if outliers are clearly
erroneous or have a significant impact on the analysis.
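
The three treatments can be compared side by side in a short sketch. The bounds are assumed to come from an earlier detection step (for example IQR fences), the data is made up, and winsorizing follows the definition given above (nearest non-outlier value).

```python
def winsorize(values, lower, upper):
    """Replace values outside [lower, upper] with the nearest value inside it."""
    inside = [v for v in values if lower <= v <= upper]
    lo, hi = min(inside), max(inside)
    return [lo if v < lower else hi if v > upper else v for v in values]

def cap(values, lower, upper):
    """Capping: clamp every value to the predetermined bounds."""
    return [min(max(v, lower), upper) for v in values]

def remove_outliers(values, lower, upper):
    """Drop values outside [lower, upper] entirely."""
    return [v for v in values if lower <= v <= upper]

# Hypothetical returns and hypothetical bounds.
returns = [-0.30, -0.02, 0.00, 0.01, 0.02, 0.35]
lower, upper = -0.05, 0.05

print("winsorized:", winsorize(returns, lower, upper))
print("capped:    ", cap(returns, lower, upper))
print("removed:   ", remove_outliers(returns, lower, upper))
```

Winsorizing and capping keep the number of observations unchanged, while removal shortens the series, which matters for time-aligned data.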
Common Data
Quality Issues:
Duplicate Data

• Duplicate data is another

common issue in financial
datasets. It can occur due to
various reasons, such as:
• Data entry errors
• Data integration
• Data migration
Identifying
Duplicate Entries
• Exact matches - Identify records with
identical values for all relevant fields
(e.g., customer ID, transaction ID).
• Fuzzy matching: Use algorithms to
identify records with similar but not
identical values, such as variations in
names or addresses.
• Record linkage: Combine information
from multiple sources to identify
records that refer to the same entity.
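
Exact and fuzzy matching can be sketched with the standard library. The field names, the sample records, and the 0.85 similarity threshold are assumptions; `difflib.SequenceMatcher` is one simple similarity measure among many, and production record linkage typically uses more specialized methods.

```python
from difflib import SequenceMatcher

def exact_duplicates(records, key_fields):
    """Return records whose key fields repeat an earlier record."""
    seen, dupes = set(), []
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        if key in seen:
            dupes.append(rec)
        else:
            seen.add(key)
    return dupes

def similarity(a, b):
    """Case-insensitive similarity ratio between 0 and 1."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def fuzzy_pairs(names, threshold=0.85):
    """Pairs of names that are similar enough to review as possible duplicates."""
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if similarity(a, b) >= threshold
    ]

# Hypothetical transactions and customer names.
transactions = [
    {"transaction_id": "T1", "amount": 100.0},
    {"transaction_id": "T2", "amount": 250.0},
    {"transaction_id": "T1", "amount": 100.0},
]
names = ["Acme Holdings Ltd", "ACME Holdings Ltd.", "Zenith Capital"]

print(exact_duplicates(transactions, ["transaction_id"]))
print(fuzzy_pairs(names))
```

Fuzzy matches are candidates rather than confirmed duplicates, so they are usually flagged for review instead of being merged automatically.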
Handling
Duplicate Entries

• Merging: Combine the duplicate

records into a single record,
preserving relevant information
from both.
• Deleting: Remove one or both of
the duplicate records, ensuring
that the remaining record
contains the most accurate and
up-to-date information.
• Flagging: Mark duplicate records
as such for further investigation
or correction
Standardizing
Financial Data

• Standardizing financial data is

essential for accurate analysis,
comparison, and modeling. It
involves converting data into a
consistent format and scaling or
normalizing it to ensure
comparability and improve
model performance.


Converting
Different Currencies
or Formats

• Currency conversion
• Date format standardization
• Decimal separator
standardization
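
A small sketch of the three conversions listed above. The exchange rates are hypothetical fixed values chosen for illustration (real rates change over time and must be matched to the transaction date), and the decimal helper assumes the European style where "." separates thousands and "," marks decimals.

```python
from datetime import datetime

# Hypothetical fixed rates to USD, for illustration only.
RATES_TO_USD = {"USD": 1.0, "EUR": 1.10, "GBP": 1.25}

def to_usd(amount, currency):
    """Currency conversion using the lookup table above."""
    return amount * RATES_TO_USD[currency]

def to_iso_date(text, fmt):
    """Parse a date in a known source format and return YYYY-MM-DD."""
    return datetime.strptime(text, fmt).date().isoformat()

def parse_european_number(text):
    """Convert a string such as '1.234,56' to the float 1234.56."""
    return float(text.replace(".", "").replace(",", "."))

print(to_usd(200.0, "EUR"))
print(to_iso_date("31/12/2024", "%d/%m/%Y"))
print(to_iso_date("12-31-2024", "%m-%d-%Y"))
print(parse_european_number("1.234,56"))
```

The source format has to be known for each feed: a string like 03/04/2024 cannot be disambiguated from the value alone.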


Data
Normalization
and Scaling

• Normalization: Rescales data to

a specific range, typically
between 0 and 1. This is useful
when dealing with data with
different units or magnitudes.
• Scaling: Transforms data to a
specific range or scale, such as
standard scores (z-scores). This
can be helpful for improving
model performance and
preventing certain types of bias.
Common
normalization and
scaling techniques

• Min-max scaling
• Z-score normalization
• Robust scaling
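
The three techniques can be written out with the standard library as follows. This is a sketch under stated assumptions: z-score normalization here uses the population standard deviation, robust scaling is taken to mean "subtract the median, divide by the IQR", and the sample volumes are invented.

```python
from statistics import mean, median, pstdev, quantiles

def min_max_scale(values):
    """Rescale to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score_normalize(values):
    """Center on the mean and divide by the standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def robust_scale(values):
    """Center on the median and divide by the interquartile range."""
    q1, _, q3 = quantiles(values, n=4)
    med = median(values)
    return [(v - med) / (q3 - q1) for v in values]

# Hypothetical trading volumes with one very large value.
volumes = [10.0, 20.0, 30.0, 40.0, 1000.0]

print(min_max_scale(volumes))
print(z_score_normalize(volumes))
print(robust_scale(volumes))
```

With the large value present, min-max scaling squeezes the other four observations close to zero, whereas robust scaling depends only on the median and quartiles.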
Descriptive Statistics
in Finance: Central
Tendency

• Central tendency measures

provide a summary of the
typical or average value in a
dataset. In finance, these
measures are commonly used
to analyze financial data, such
as stock returns, market
indices, and risk metrics.
Mean

• The sum of all values divided by the

number of observations.
• Mean = (Sum of all values) / (Number
of observations)
Median
• The middle value in a
dataset when the values
are arranged in
ascending order.
• If the number of
observations is odd, the
median is the middle
value.
• If the number of
observations is even, the
median is the average of
the two middle values.

Mode

• The most frequently occurring

value in a dataset.
• Identify the value(s) that appear
most often.
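
The three measures of central tendency are available in Python's `statistics` module. The monthly returns below are made-up numbers used only to show the calls.

```python
from statistics import mean, median, multimode

# Hypothetical monthly returns.
monthly_returns = [0.02, -0.01, 0.03, 0.02, 0.01, 0.02]

avg = mean(monthly_returns)         # sum of values / number of observations
mid = median(monthly_returns)       # average of the two middle values (even count)
modes = multimode(monthly_returns)  # list of the most frequent value(s)

print(f"mean={avg:.4f} median={mid:.4f} modes={modes}")

# Odd versus even number of observations:
print(median([1, 3, 5]))     # middle value
print(median([1, 3, 5, 7]))  # average of 3 and 5
```

`multimode` returns a list because a dataset can have more than one most frequent value.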
Descriptive
Statistics in Finance:
Measures of Spread

• Measures of spread, also known

as dispersion or variability,
quantify how much the data
points in a dataset vary from the
central tendency. In finance,
these measures are crucial for
understanding the risk or
volatility of an asset.
Variance

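• The average squared deviation
from the mean.
• Variance = (Sum of (value -
mean)^2) / (Number of
observations - 1)
• Dividing by the number of
observations minus 1 gives the sample
variance, the usual choice when the
data is a sample rather than the
whole population.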
• A higher variance indicates
greater dispersion of the data
points from the mean,
suggesting higher volatility or
risk.


Standard
Deviation
• The square root of the
variance.
• Standard Deviation =
√(Variance)
• The standard deviation is a
more interpretable
measure of spread than
variance, as it is expressed
in the same units as the
original data. A higher
standard deviation
indicates greater volatility
or risk.
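
The variance and standard deviation formulas above can be checked by hand against the standard library. The returns are illustrative values; `statistics.variance` and `statistics.stdev` both use the n - 1 (sample) denominator.

```python
from math import sqrt
from statistics import stdev, variance

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    m = sum(values) / n
    return sum((v - m) ** 2 for v in values) / (n - 1)

# Hypothetical period returns.
returns = [0.01, 0.03, -0.02, 0.02, 0.01]

var = sample_variance(returns)
sd = sqrt(var)

print(f"variance={var:.6f} (library: {variance(returns):.6f})")
print(f"std dev ={sd:.6f} (library: {stdev(returns):.6f})")
```

The standard deviation is in the same units as the returns themselves, which is why it is the figure usually quoted as volatility.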
Descriptive Statistics in Finance:
Distribution of Returns
• Understanding the distribution of returns is essential in
finance for assessing risk, evaluating investment
strategies, and making informed decisions. Histograms
and measures of the shape of the data, including normality
and skewness, are key tools for this purpose.
Understanding the Shape of Data
• Normal Distribution: A bell-
shaped curve where the
mean, median, and mode
are equal. It is often
assumed in financial
modeling.
• Skewness: Measures the
asymmetry of the
distribution
• Positive skewness
(right-skewed)
• Negative skewness
(left-skewed)
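
Skewness can be computed directly from its definition. The sketch below uses the moment-based coefficient (third central moment divided by the 1.5 power of the second central moment); statistical packages often apply a small-sample correction, so their values may differ slightly. The data is invented.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2 ** 1.5, using population moments."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]           # long tail of large values
left_skewed = [-v for v in right_skewed]     # mirror image

print("symmetric:   ", skewness(symmetric))
print("right-skewed:", skewness(right_skewed))
print("left-skewed: ", skewness(left_skewed))
```

A positive result indicates a longer right tail, a negative result a longer left tail, and a value near zero indicates rough symmetry.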
Introduction to Financial Data Visualization
Line Charts
• A line chart connects data points with lines, creating a visual representation of how a variable changes over time. In finance, this is often used to track the price of a stock, index, or other financial asset.
Key components of a line chart

• X-axis: Represents time, typically in days, weeks, months, or years.
• Y-axis: Represents the value of the financial asset (e.g., stock price, index level).
• Data points: Represent the value of the asset at specific points in time.
• Line: Connects the data points, showing the trend of the asset's value over time.
Interpreting Line Charts

• Upward trend
• Downward trend
• Volatility
Introduction to
Financial Data
Visualization:
Scatter Plots

• Scatter plots are a

valuable tool for
visualizing the
relationship between
two variables. In
finance, they are
often used to
analyze the
relationship between
risk and return.
Key components of a risk vs. return scatter plot:

• X-axis: Represents the risk measure, typically standard deviation or beta.
• Y-axis: Represents the return, typically measured as annualized return or excess return.
• Data points: Represent individual assets or portfolios.
Interpreting Risk vs. Return Scatter Plots

• Upward slope: A general upward slope indicates that assets with higher risk tend to have higher returns. This is consistent with the concept of the risk-return trade-off.
• Clustering: If the data points cluster in a specific area, it suggests that those assets share similar risk-return characteristics.
• Outliers: Outliers are data points that are significantly different from the majority of the data. They may represent assets with unusually high or low risk-return profiles.
Introduction to
Financial Data
Visualization:
Histograms

• Histograms are a versatile

tool for visualizing the
distribution of numerical
data. In finance, they are
commonly used to
understand the distribution
of returns or trading
volumes.


Key components of a histogram

• X-axis: Represents the range of values (e.g., returns or trading volumes).
• Y-axis: Represents the frequency of occurrence.
• Bars: Represent the number of observations within each range.
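
The counting behind a histogram can be sketched without a plotting library. The bin edges and returns below are made up, and the convention that each bin includes its left edge (with the last bin also including its right edge) is an assumption; plotting tools such as Matplotlib perform this binning internally.

```python
def histogram(values, bin_edges):
    """Count observations per bin. Bins are [left, right), except the
    last bin, which also includes its right edge."""
    counts = [0] * (len(bin_edges) - 1)
    for v in values:
        for i in range(len(counts)):
            left, right = bin_edges[i], bin_edges[i + 1]
            is_last = i == len(counts) - 1
            if left <= v < right or (is_last and v == right):
                counts[i] += 1
                break
    return counts

# Hypothetical daily returns and bin edges (the X-axis ranges).
returns = [-0.03, -0.01, 0.0, 0.01, 0.01, 0.02, 0.04]
edges = [-0.04, -0.02, 0.0, 0.02, 0.04]

counts = histogram(returns, edges)

# Text rendering: one row per bin, bar length = frequency (the Y-axis).
for left, right, count in zip(edges, edges[1:], counts):
    print(f"[{left:+.2f}, {right:+.2f}) {'#' * count}")
```

The choice of bin width changes the apparent shape, so it is worth trying more than one width before drawing conclusions.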
Interpreting histograms

• Shape: The shape of the histogram can be described as normal, skewed, bimodal, or other patterns.
• Central tendency: The peak of the histogram marks the most common range of values.
• Spread: The width of the histogram indicates the dispersion of the data.
• Outliers: Outliers can be identified as data points that are far from the main body of the data.
Analyzing Trends and
Patterns
Trend Analysis and Seasonality in
Financial Time Series Data
• Financial time series data, such as stock prices,
exchange rates, and interest rates, often exhibit
patterns and trends that can be identified and exploited
for investment or risk management purposes. Two
common patterns to analyze are trends and seasonality.


Trend
Analysis
• Identifying the long-term
direction of a time series,
such as upward (uptrend),
downward (downtrend), or
sideways (sideways trend).


Methods
• Moving averages: Calculate averages of data points
over a specified window to smooth out short-term
fluctuations and identify underlying trends.
• Regression analysis: Fit a regression line to the data to
estimate the trend and its slope.
• Differencing: Transform the data by taking the
difference between consecutive observations to remove a
trend and make the series closer to stationary; differencing
at the seasonal lag removes seasonality.
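
The three trend methods above can be sketched in plain Python. The price series is invented, the regression uses the observation index (0, 1, 2, ...) as the time variable, and the window length of 3 is an arbitrary choice for illustration.

```python
def moving_average(values, window):
    """Simple moving average over a trailing window."""
    return [
        sum(values[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(values))
    ]

def trend_slope(values):
    """Least-squares slope of the series against the time index."""
    n = len(values)
    mean_t = (n - 1) / 2
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in enumerate(values))
    den = sum((t - mean_t) ** 2 for t in range(n))
    return num / den

def difference(values, lag=1):
    """Difference between each observation and the one `lag` steps earlier."""
    return [values[i] - values[i - lag] for i in range(lag, len(values))]

# Hypothetical closing prices.
prices = [100.0, 102.0, 101.0, 105.0, 107.0, 106.0, 110.0]

print("3-period moving average:", moving_average(prices, 3))
print("trend slope per period: ", trend_slope(prices))
print("first differences:      ", difference(prices))
```

A positive slope here corresponds to an uptrend; passing the seasonal period as `lag` gives seasonal differencing.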
Seasonality
• Patterns that repeat at
regular intervals, such as
daily, weekly, monthly, or
yearly.


Detection
• Visual inspection: Examine
the time series plot for
recurring patterns.
• Statistical methods: Use
techniques like Fourier
analysis or seasonal
decomposition to identify
and quantify seasonal
components.
Combining Trend and
Seasonality Analysis

• Decomposition: Break down a time

series into its trend, seasonal, and
residual components.
• Forecasting: Use the identified
trend and seasonal patterns to
forecast future values.
• Risk management: Identify and
manage risks associated with
seasonal fluctuations.
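
A classical additive decomposition can be sketched as follows, assuming the series is the sum of trend, seasonal, and residual parts. The trend is estimated with a centered moving average, the seasonal component is the average detrended value for each position in the cycle, and the quarterly sales figures are synthetic. The series needs to cover at least two full periods. Libraries such as statsmodels offer more complete implementations.

```python
def centered_moving_average(values, period):
    """Trend estimate. For an even period the two end points of the
    window get half weight (a 2 x period moving average)."""
    half = period // 2
    trend = [None] * len(values)
    for t in range(half, len(values) - half):
        window = values[t - half:t + half + 1]
        if period % 2:
            trend[t] = sum(window) / period
        else:
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
            trend[t] = total / period
    return trend

def decompose(values, period):
    """Additive decomposition: value = trend + seasonal + residual."""
    trend = centered_moving_average(values, period)
    detrended = [
        v - tr if tr is not None else None for v, tr in zip(values, trend)
    ]
    seasonal_means = []
    for s in range(period):
        obs = [d for i, d in enumerate(detrended)
               if i % period == s and d is not None]
        seasonal_means.append(sum(obs) / len(obs))
    # Center the seasonal effects so they sum to zero over one cycle.
    shift = sum(seasonal_means) / period
    seasonal = [seasonal_means[i % period] - shift for i in range(len(values))]
    residual = [
        v - tr - s if tr is not None else None
        for v, tr, s in zip(values, trend, seasonal)
    ]
    return trend, seasonal, residual

# Synthetic quarterly sales: linear trend plus a repeating seasonal pattern.
sales = [110.0, 97.0, 94.0, 111.0, 118.0, 105.0,
         102.0, 119.0, 126.0, 113.0, 110.0, 127.0]

trend, seasonal, residual = decompose(sales, period=4)
print("trend:   ", trend)
print("seasonal:", seasonal[:4])
```

The trend is undefined for the first and last half-period, which is why those entries are `None`. The seasonal pattern and a projected trend could then be combined into a simple forecast.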
Identifying Short-
Term vs. Long-
Term Patterns

• When analyzing financial

time series data, it's
crucial to differentiate
between short-term and
long-term patterns. These
patterns can provide
valuable insights into
market dynamics and
inform investment
decisions.
Short-Term Patterns

• Intraday price fluctuations

• Short-term market corrections
• Technical analysis indicators
Long-Term Patterns

• Secular trends
• Market cycles
• Fundamental analysis
