Chapter 1 - Intro To Data Science
Learning objectives
After studying this chapter, you should be able to:
• Define the following key term: Data Science
• Discuss the basic characteristics of data science projects
• Discuss the steps in a data science project
• The first goal of exploratory data analysis is to find out whether the data you have is
suitable for answering your question.
• Is there enough data?
• Are there too many missing values?
• Am I missing certain variables, or do I need to collect more data to get those
variables?
• The second goal of exploratory data analysis is to start to develop a sketch of the solution.
• Apply your statistical, mathematical and technological knowledge.
• The formal modeling phase is where you write down precisely which questions you are
asking and which parameters you are trying to estimate.
• Challenging your model and developing a formal framework is really important to making
sure that you can develop robust evidence for answering your question.
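The data-suitability questions raised under exploratory data analysis (is there enough data, are there too many missing values, are any variables missing) can be checked mechanically. What follows is only a minimal sketch using the Python standard library. The records, the column names, the function name suitability_report, and the thresholds are all hypothetical and chosen purely for illustration.

```python
# Hypothetical records; None marks a missing value.
records = [
    {"age": 34, "income": 52000, "churned": False},
    {"age": None, "income": 61000, "churned": True},
    {"age": 45, "income": None, "churned": False},
    {"age": 29, "income": 48000, "churned": None},
]


def suitability_report(rows, required_columns, min_rows, max_missing_share):
    """Summarize whether the rows look usable for the question at hand."""
    # Collect every column that appears in at least one row.
    present = set()
    for row in rows:
        present.update(row.keys())

    # Share of missing (None or absent) values for each column that appears.
    missing_share = {}
    for col in sorted(present):
        missing = sum(1 for row in rows if row.get(col) is None)
        missing_share[col] = missing / len(rows) if rows else 1.0

    return {
        # Is there enough data?
        "enough_rows": len(rows) >= min_rows,
        # Are there too many missing values?
        "missing_share": missing_share,
        "too_sparse": [c for c, s in missing_share.items() if s > max_missing_share],
        # Am I missing certain variables?
        "absent_columns": sorted(set(required_columns) - present),
    }


# The thresholds below are illustrative assumptions, not recommendations.
report = suitability_report(
    records,
    required_columns=["age", "income", "churned", "region"],
    min_rows=3,
    max_missing_share=0.3,
)
print(report)
```

Here the report shows that the hypothetical "region" variable was never collected, which is the kind of finding that sends you back to gather more data before modeling.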
Step 6: Interpretation
• You have probably done many different analyses and fit many different models, so you
have many different pieces of information to think about.
• Part of the challenge of the interpretation phase is to assemble all of the information and
weigh each of the different pieces of evidence.
• You know which pieces are more reliable, which are more uncertain than others, and
which are more important than others, so you can get a sense of the totality of evidence
with respect to answering the question.
Step 7: Communication
• Any data science project that is successful will want to communicate its findings to some
sort of audience.
• That audience may be internal to your organization or external, and it may be large or
just a few people.
• Users say "Ok Google", "Siri", "Cortana", etc., and these devices respond to the voice
command using speech recognition algorithms.
Transport:
• The transport industry is also using data science to create self-driving cars.
Healthcare:
• Data science is being used for tumor detection, drug discovery, medical image analysis,
virtual medical bots, etc.
Risk Detection:
• Most finance companies and banks use data science to reduce risk and avoid losses
while increasing customer satisfaction.
Crime Analysis:
• Data analytics can be used for crime analysis based on areas of frequent crime and
historical patterns (predictive policing). It is also used to predict civil unrest in cities
based on social media posts.
Churn Prediction:
• Analyzing historical transactions to predict which customers will churn helps a company
retain its existing customers.
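To make the idea of working from historical transactions concrete, here is a minimal sketch. It is a simple recency rule, not a trained prediction model: a customer is flagged as at risk when their last purchase is older than a cutoff. The transaction data, the function name at_risk_customers, and the 60-day cutoff are all hypothetical.

```python
from datetime import date

# Hypothetical transaction history: (customer_id, purchase_date).
transactions = [
    ("c1", date(2024, 1, 5)),
    ("c1", date(2024, 3, 20)),
    ("c2", date(2023, 11, 2)),
    ("c3", date(2024, 3, 1)),
    ("c3", date(2024, 3, 28)),
]


def at_risk_customers(history, today, max_idle_days):
    """Return customers whose most recent purchase is older than max_idle_days."""
    # Find each customer's most recent purchase date.
    last_purchase = {}
    for customer_id, purchase_date in history:
        previous = last_purchase.get(customer_id)
        if previous is None or purchase_date > previous:
            last_purchase[customer_id] = purchase_date

    # Flag customers who have been idle for longer than the cutoff.
    return sorted(
        customer_id
        for customer_id, last in last_purchase.items()
        if (today - last).days > max_idle_days
    )


# The reference date and the 60-day cutoff are illustrative assumptions.
flagged = at_risk_customers(transactions, today=date(2024, 4, 1), max_idle_days=60)
print(flagged)
```

In practice, recency would be one input feature among several (purchase frequency, spend, support contacts) fed to a statistical model, rather than a fixed rule.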