Week 2 Lecture 3

The document outlines the landscape of data analytics in the oil and gas industry, focusing on production forecasting, maintenance, and data science best practices. It emphasizes the importance of building the right data science team with diverse skills and roles, and discusses methodologies such as Agile for project management. Additionally, it highlights emerging technologies like AI, machine learning, and their applications in various sectors.

Uploaded by

parth25stat

Data Analytics

Landscape

6/25/22 11:59 AM 1
Oil and Gas

05/18/2025 07:38 AM 2
Production Forecasting

05/18/2025 07:38 AM 4
Production Forecasting and Well shut-ins

05/18/2025 07:38 AM 5
Predict/Detect Screen-Outs

05/18/2025 07:38 AM 7
Predict Hydraulic Pump Failure

05/18/2025 07:38 AM 8
Offline Maintenance Scope Setting

05/18/2025 07:38 AM 10
Discussion

6/25/22 11:59 AM 11
Best Practices To Execute Data Science Projects?

05/18/2025 07:38 AM 12
Best Practices To Execute Data Science Projects?

• Identify your key business drivers for data science
• Create an effective team
• Emphasize communications skills
• Expand the impact through visualization and storytelling
• Give the data scientists all the data
• Pave the way for operationalizing the analytics
• Improve governance to avoid data science “creepiness”

05/18/2025 07:38 AM 13
People

[Figure: Data Science and Data Analytics, linked to People, Technology, Processes and Business]

14
BUILDING RIGHT DATA SCIENCE CAPABILITY

15
DATA SCIENCE TEAM?

The Right People


The aim is to select a team of individuals who together cover the full set
of skills, rather than a single individual who has all of them.

16
Data Science Team Key Roles

The Data Science Team is made up of:
• Chief Data Officer
• Manager/Lead Data Scientist
• Data Scientist
• Data Engineer
• Data Architect
• Data Analyst

17
Team within Organizations
Who Is a Data Scientist?

A data scientist combines three areas:
• Math, Statistics, AI, Machine learning
• Domain Knowledge, BI tools (Visualization)
• Computer Science (Programming, Databases)

Competencies:
• Team Player
• Ready to face failure
• Communicative
• Ready to get out of comfort zone
• Analytical mindset
• Curiosity
• Empathy
• Proactive

19
Major Roles & Skills
[Figure: relative skill emphasis for the Data Engineer, Data Analyst and Data Scientist roles across Operation, Developing, Infrastructure Design, Visualization/Interpretation, Machine Learning, and Math & Statistics]

20
Data Science Key Roles

Data Engineer
• Infrastructure
• Database (SQL and NoSQL)
• Feature engineering

Data Analyst
• Statistics
• Business Intelligence
• Reporting

Data Scientist
• Statistics
• Machine learning
• Programming languages

21
Skill Set Data Scientist

Satisfactory
• Basic statistics
• Basic Machine learning (Regression and Decision Trees)
• Feature engineering
• Excel, Basic SQL
• Basic R and Python
• BI tools - Visualization
• Domain Knowledge and communication skills

Average
• Machine learning (ANN, SVM, PCA, Naïve Bayes, k-means, KNN, etc.)
• Programming (R, Python)
• SQL
• Domain Knowledge and communication skills

Advanced
• Deep learning
• Unstructured data analysis (text analysis, video analytics)
• Understanding of Hadoop based Ecosystems (Hortonworks and Cloudera)
• NoSQL
• Hive, Pig, SparkSQL

22
23
What are the right skills for Data Scientist & Data Engineer?

Data Scientist & Data Engineer


24
What are the right skills for Data Scientist and Data Engineer?

Data Scientist
• Visualization
• Story Telling
• Math
• Statistics
• Modeling

Both (overlap)
• Programming

Data Engineer
• System Implementation
• DB Administration
• Data Storage

25
Processes

[Figure: Data Science and Data Analytics, linked to People, Technology, Processes and Business]

An interdisciplinary field that employs sophisticated tools and techniques to extract knowledge
and actionable insights from structured or unstructured data in order to optimize business
objectives.

26
Design Thinking Process
1 Problem Definition: What do you want to solve?
• Create a point of view that is based on user needs and insights.
• What are their needs?

2 Empathy Building: For whom do you want to solve which problem?
• Learn about the audience for whom you are designing, by observation and interview.
• Who is my user?
• What matters to this person?

3 Ideation: Which ideas could solve the problem?
• Brainstorm and come up with as many creative solutions as possible.
• Wild ideas encouraged.

4 Prototyping: What exactly is the solution?
• Build a representation of one or more of your ideas to show to others.
• How can I show my idea?
• Remember: A prototype is just a rough draft!

5 Testing: What does the user think about your solution?
• Share your prototyped idea with your original user for feedback.
• What worked?
• What didn't?
27
Data Science Process

Expose Discover

Collect

Analyze

Process

Integrate
Store
28
Data Science
Pipeline stages, from left to right:
Data sources → Data cleaning → Data enrichment → Advanced analytics → Visualization/dashboards → Decision makers

Callouts on the slide:
• Internal data and external data
• Correct faulty and missing data
• Combine and enrich data and features
• Machine learning
• Fast visualization
• Dynamic alerts
• Proactive monitoring
• React quickly to events

29
CRISP-DM

CRISP-DM (Cross-Industry Standard Process for Data Mining)

Phases:
• Business Understanding
• Data Understanding
• Data Preparation
• Modelling
• Evaluation
• Deployment

The slide groups these phases under Data Strategy, Data Engineering and Data Science.

30
DATA SCIENCE & DESIGN THINKING LINKAGE
Data science steps:
Step 1: Define hypothesis to test or prediction to be made
Step 2: Gather data... and more data (Data Lake: SQL + Hadoop)
Step 3: Prepare data: build schema (schema-on-query)
Step 4: Visualize the data (Tableau, MicroStrategy, Spotfire, ggplot2, ...)
Step 5: Build analytic models (Python, R, Mahout)
Step 6: Evaluate model “Goodness of fit” (coefficients, confidence level)

Design thinking stages:
Empathize
• Individual & small group interviews
• Seek to understand; non-judgmental
Define
• Personas (objectives, decisions, challenges)
• Envisioning variables that might be better predictors of performance
Ideate
• Data visualization
• Descriptive analytics
• Illustrative analytics
Prototype
• Predictive analytics
• Prescriptive analytics
Test
• Goodness of fit
• Codify impediments
• Fail fast / learn faster / iterate
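As a rough illustration of Step 6, the sketch below fits a straight line by ordinary least squares and reports R² as one common goodness-of-fit measure. The slide does not say which measure is meant, so R² is an assumption here, and the monthly production figures are made-up illustrative numbers.

```python
# Minimal sketch: least-squares line fit plus R^2 as a goodness-of-fit measure.
# The data below are hypothetical (month index vs. production rate).

def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(ys, predictions):
    """Share of the variance in ys that the predictions explain."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

months = [1, 2, 3, 4, 5]
production = [100, 90, 82, 71, 60]

slope, intercept = fit_line(months, production)
predictions = [slope * m + intercept for m in months]
r2 = r_squared(production, predictions)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r2:.4f}")
```

An R² close to 1 means the line explains most of the variation in the data. It says nothing about how the model behaves on data it has not seen.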

31
Data Science Process

SCOPING
• Define analysis goals
• Understand business objectives and context

DATA WRANGLING
• Receive data
• Understand data
• Clean and prepare data

DATA ANALYSIS
• Explore data
• Model and execute analytic workflows
• Interpret results

VALUE GENERATION AND BENEFITS
• Evaluate results and benefits
• Deploy and scale into production

The Data Science Process iterates through these four stages.

32
Developing Data Science With Strategic Capability Guidelines
Analytics Capability Adoption Curve
Degree of Support

OWNERSHIP

ADOPTION

ACCEPTANCE

UNDERSTANDING

AWARENESS

33
ENABLE ANALYTICAL STRATEGY
The slide arranges analytics capabilities as a staircase from Decision Support to Decision Guidance, under three headings: Insights, Foresights and Optimize.

• Raw Data: Product, Sales, Inventory, Customer
• Standard Reporting: Comp Sales, Sell-thru
• OLAP Reporting: Multi-dimensional querying, Drill-thru, Drill-Across
• Insights/Limited What-if: Insights in inference & causality, Basic scenario analysis
• Descriptive Modeling: Describe historical event
• Predictive Modeling: Modeling targeted to enable decisions
• Optimization: Prescription of best choice amongst a complex web of options

Questions posed along the staircase: What happened? What will happen? What best can we do?


34
What is Agile Methodology?
Definition of Agile:
• Characterized by quickness, lightness, and ease of movement; nimble
• Mentally quick or alert: an agile mind.

Agile Methodology Promotes:


• Project management process that encourages frequent inspection and adaptation.
• Leadership philosophy that encourages teamwork, self-organization and accountability.
• Set of engineering best practices that allow for rapid delivery of high-quality software.
• Business approach that aligns development with customer needs and company goals.

35
What is Agile Methodology?
• Map processes to determine the starting point and the “Actual State”.
• Determine KPIs, reports and other metrics, per user, in order to monitor that the process is working as agreed.
• Optimize a process with the right combination of technologies.
• Deliver developments weekly for the end user to operate and request changes. Ideally, the end user is part of the development and testing process.
• Choose technologies and review options with the client. The client is part of the project from the very beginning, and feedback is vital.

36
Agile Scrum Methodology

37
Agile Methodology
The slide shows the same five-phase cycle three times in sequence, each pass leading into the next iteration.

Data Understanding
• Collect Data
• Describe Data
• Explore Data
• Verify Data Quality

Data Preparation
• Select Data
• Clean Data
• Integrate Data
• Format Data

Modeling
• Select Modeling Technique
• Generate Test Design
• Build Model
• Assess Model

Evaluation
• Evaluate Results
• Review Process
• Determine Next Steps
• Evaluate results with domain experts

Deployment
• Plan Deployment
• Plan Monitoring
• Review Project
• Results / Outcomes
38
39
Emerging Technologies
Top Emerging Technologies

41
In this Class

1. Artificial Intelligence
2. Machine Learning and Deep Learning
3. Natural Language Processing
4. Computer Vision
5. Drones
6. Cloud Computing
7. Internet of Things
8. Edge Computing
9. Smart Cities

42
Statistics About Emerging Technologies

The Internet of Things & Smart Cities
• $1.5 trillion potential market
• 50 billion devices
• $20 billion by 2050 on sensors alone

Artificial Intelligence
• 80% to 90% of the world's data is unstructured

Quantum Computing
• $39.2 million potential market in 2017
• $2.2 billion in 2025

Cybersecurity
• 4.5 billion records breached in the first half of 2018
• Hackers attack a computer every 39 seconds

43
Artificial
Intelligence

05/18/2025 44
Artificial Intelligence

• Intelligence: “The Capacity to learn and solve problems”


• Artificial Intelligence: Artificial intelligence (AI) is the simulation of human intelligence by
machines.
• The ability to solve problems
• The ability to act rationally
• The ability to act like humans
05/18/2025 45
Artificial Intelligence Timeline

• The term ‘Artificial Intelligence’ was coined by John McCarthy.
• ‘Shakey’, the first general purpose mobile robot, was built.
• The supercomputer ‘Deep Blue’ was designed, which defeated the world chess champion in a game.
• The first commercially successful robotic vacuum cleaner was created.
• Speech recognition, video analytics, industry robots, smart homes and many more.

05/18/2025 46
Artificial Intelligence Future

05/18/2025 47
Current Status of Artificial Intelligence

Healthcare
• Solving a variety of problems of patients, hospitals and the healthcare industry overall.

Robotics
• Robots have become very common in many industries.
• Can do repetitive, laborious tasks.

Finance
• Algorithmic trading
• Market analysis & data mining
• Personal finance
• Portfolio management
05/18/2025 48
Artificial Intelligence
• Natural Language Processing
• Robotics
• Machine Learning
o Supervised learners: classifiers (Nearest Neighbor, Naïve Bayes, Decision Tree) and predictors (Regression Trees, Model Trees)
o Dual use: Neural Networks, Support Vector Machine
o Unsupervised learners: Pattern Detection (Associative Learners) and Clustering (K Means)
• Knowledge-Based Systems
o Expert Systems: deterministic rules & processes & decisions
An Expert System
• Knowledge Base: A huge organized set of knowledge about a particular subject. It contains facts and judgmental knowledge, which give it the ability to guess like a human.
• Inference Engine: A set of rules on which to make decisions (using the if-then structure). The inference engine does reasoning by manipulating the knowledge base.
• User Interface: The user interface presents questions and information to the operator and also receives answers from the operator.
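The three components can be sketched in a few lines. This is a toy illustration, not a description of any particular expert-system product: the rule names and facts are hypothetical, and forward chaining is assumed as the reasoning strategy because the slide only says the engine uses if-then rules.

```python
# Toy expert system: knowledge base (if-then rules), inference engine
# (forward chaining), and a stand-in for the user interface.
# All facts and rule names are hypothetical examples.

RULES = [
    ({"pressure_dropping", "motor_current_high"}, "pump_wear_suspected"),
    ({"pump_wear_suspected", "vibration_high"}, "schedule_pump_inspection"),
]

def infer(facts, rules):
    """Inference engine: fire rules until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

def advise(observations):
    """User-interface stand-in: take the operator's answers, return advice."""
    derived = infer(observations, RULES) - set(observations)
    return sorted(derived)

advice = advise({"pressure_dropping", "motor_current_high", "vibration_high"})
print(advice)
```

The second rule only fires because the first one has already added `pump_wear_suspected` to the known facts, which is the "reasoning by manipulating the knowledge base" described above.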
05/18/2025 50
Expert Systems Explanation

• A non-expert user sends a query through the User Interface.
• The Inference Engine reasons over the Knowledge Base, which holds knowledge from an expert.
• The Expert System returns advice to the user.

05/18/2025 51
Tools For Artificial Intelligence

• Search & optimization
• Logic
• Probabilistic Methods
• Neural Networks

05/18/2025 52
Advantages & Disadvantages of A.I

Advantages
• The chances of error are almost nil
• It can be used to explore space and the depths of the ocean
• Smartphones are the greatest example of AI
• It can be used in time-consuming tasks efficiently
• Algorithms can help doctors assess patients and their health risks
• Machines do not require sleep or breaks and are able to function without stopping

Disadvantages
• High cost
• Decrease in demand for human labor
• AI may be programmed to do something devastating
• Machine ethics
• The storages and success are not as effective as human brains
• No improvements with experience

05/18/2025 53
How to choose technology to implement data science in any organization?

05/18/2025 07:38 AM 54
How to choose technology to implement data science in any organization?

• Know your requirements
• Does your team have the skills to use that technology?
• Depends on your budget
• Does the selected technology fulfill your requirements?
• Look through their website & services
• Examine their domain expertise
• Study their experiences
• Read reviews on specialized sites
• Look through their development activities
• Learn their workflow and best methodologies

05/18/2025 07:38 AM 55
Machine Learning & Deep Learning
05/18/2025 56
Types of Learning
Supervised: Learning with a labeled training set
Example: email classification with already labeled emails

Unsupervised: Discover patterns in unlabeled data


Example: cluster similar documents based on text

Reinforcement learning: Learn to act based on feedback/reward


Example: learn to play Go, reward: win or lose
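The difference between the first two types can be shown with a small sketch. The supervised part uses a nearest-neighbor rule on labeled examples, and the unsupervised part splits unlabeled numbers into two groups with a one-dimensional k-means (k = 2). Both algorithms and all data values are illustrative choices, not taken from the slides. Reinforcement learning is left out because it needs an environment to interact with.

```python
import math

def nearest_neighbor_predict(train, point):
    """Supervised: return the label of the closest labeled training example."""
    _, closest_label = min(
        train, key=lambda example: math.dist(example[0], point)
    )
    return closest_label

def two_means_1d(values, iterations=10):
    """Unsupervised: split unlabeled numbers into two clusters (k-means, k=2).

    Assumes the values contain at least two distinct numbers.
    """
    low, high = min(values), max(values)
    for _ in range(iterations):
        cluster_low = [v for v in values if abs(v - low) <= abs(v - high)]
        cluster_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(cluster_low) / len(cluster_low)
        high = sum(cluster_high) / len(cluster_high)
    return cluster_low, cluster_high

# Supervised: hypothetical emails described by (links, exclamation marks).
labeled_emails = [((8, 5), "spam"), ((7, 3), "spam"), ((1, 0), "ham"), ((0, 1), "ham")]
prediction = nearest_neighbor_predict(labeled_emails, (6, 4))

# Unsupervised: hypothetical document lengths, no labels given.
short_docs, long_docs = two_means_1d([120, 130, 125, 900, 950, 880])

print(prediction, short_docs, long_docs)
```

The supervised function cannot work without the labels in `labeled_emails`, while the clustering function only ever sees raw numbers and has to find the grouping itself.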

05/18/2025 07:38 AM 57
Machine Learning
Machine Learning is a field of computer science that gives computers the
ability to learn without being explicitly programmed.

Machine Learning vs Traditional Programming

Traditional Programming: Data + Program → Computer → Output
Machine Learning Approach: Data + Output → Computer → Program
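A small sketch can make the contrast concrete. In the first function a person writes the rule. In the second, the rule (here just a threshold) is derived from data and the desired outputs. The well flow rates, the labels and the 5.0 cut-off are hypothetical values chosen only for illustration.

```python
def traditional_is_shut_in(rate):
    """Traditional programming: a human writes the rule (threshold picked by hand)."""
    return rate < 5.0

def learn_threshold(rates, labels):
    """Machine learning: search for the threshold that best reproduces the labels."""
    candidates = sorted(rates)
    best_threshold, best_correct = None, -1
    # Try the midpoint between each pair of neighboring observed rates.
    for lower, upper in zip(candidates, candidates[1:]):
        threshold = (lower + upper) / 2
        correct = sum((r < threshold) == lab for r, lab in zip(rates, labels))
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold

# Data (flow rates) plus the desired output (True = well was shut in).
rates = [0.5, 2.0, 3.5, 12.0, 20.0, 35.0]
labels = [True, True, True, False, False, False]

learned_threshold = learn_threshold(rates, labels)

def learned_is_shut_in(rate):
    """The 'program' produced by learning."""
    return rate < learned_threshold

print(learned_threshold, learned_is_shut_in(4.0), traditional_is_shut_in(4.0))
```

Both functions end up with the same shape, but only the second one changes when the data changes.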
Machine Learning - Workflow
1. Data Preprocessing
• Emoji translator
• Stop words removal
• Lemmatization
• Stemming
• etc.

2. Feature Engineering
• EDA
• Deriving new features from available attributes

3. Feature Selection
• Pandas
• Correlation
• Feature selection
• Sklearn

4. Model Building
• Kmeans, ANN, Naive Bayes etc. (algo)
• Sklearn (lib)
• Tensorflow (lib)

5. Model Evaluation
• F1-Score
• Classification Report
• Accuracy metrics
• Sklearn (lib)

05/18/2025 07:38 AM 60
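Two ends of this workflow can be sketched without any library: stop-word removal from the preprocessing step and the F1 score from the evaluation step. The slides name Sklearn for this work. The plain-Python version below is only meant to show what the calculation does, and the stop-word list, sentence and labels are made up.

```python
# Tiny illustrative stop-word list; real lists are much longer.
STOP_WORDS = {"the", "is", "a", "and", "to", "of"}

def preprocess(text):
    """Lowercase, split on whitespace and drop stop words."""
    return [token for token in text.lower().split() if token not in STOP_WORDS]

def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

tokens = preprocess("The pump is a source of noise")

# Hypothetical true labels and model predictions.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]
score = f1_score(y_true, y_pred)

print(tokens, score)
```

In this example there are three true positives, one false positive and one false negative, so precision and recall are both 0.75 and so is the F1 score.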
Why Machine Learning? (Importance of Machine Learning)

• Flood of available data (Especially with the advent of the internet)


• Increasing computational power
• Growing progress in available algorithms and theory developed by
researchers
• Increasing support from industries.
05/18/2025 61
Applications of Machine Learning

• Banking/Personal finance
o Mobile check deposits
o Fraud Prevention
o Credit Decisions
• Social Networking
o Facebook
o Pinterest
o Instagram
o Snapchat
• Online Shopping
o Search
o Recommendations
o Fraud protection
• Commuting
o Google’s AI-Powered Predictions
o Ride sharing apps like Uber & Lyft
• Email
o Spam Filters
o Smart Email Categorization
• Grading & Assessment
o Plagiarism Checkers
o Robo-Readers
05/18/2025 62
