1. Data Science Introduction

The document provides an overview of data science, highlighting its importance in decision-making, efficiency, and innovation across various sectors such as business, healthcare, and policymaking. It outlines the data science process, which includes capturing, maintaining, processing, analyzing, and communicating data, as well as the distinction between data science and business intelligence. Additionally, it discusses essential tools, skills, and prerequisites for data scientists, emphasizing the multidisciplinary nature of the field.

1. Data Science Introduction

• Importance of Data Science
• Need for Data Science
• Data Science Process
• Business Intelligence and Data Science
• Prerequisite for Data Scientist
• Components of Data Science
• Tools and Skills
Data Science
• Data science is all about extracting insights from complex
information with the use of programming and other
techniques.
• It is a multidisciplinary approach that combines principles
and practices from the fields of mathematics, statistics,
artificial intelligence, and computer engineering to analyze
large amounts of data and determine patterns and trends
through formats like data visualizations and predictive
models.
• With these insights, organizations can better understand why
certain events happen and develop more informed decision-
making processes or planning.
• In addition, the ability to take proactive measures, design more
Importance of Data Science
• Data science is important because it helps organizations make
better decisions, solve problems, improve their operations and
discover new advancements.
• Informed decision-making: Data science helps organizations
make better decisions by providing reliable, data-driven
insights. For example, a manufacturing company can use data
science to identify areas where it can improve its production
process.
• Improved customer experience: Data science helps
businesses understand their customers better, which can lead to
improved customer interactions and more tailored marketing.
• Better products: Data science can help companies develop
better products by identifying what their target audiences
enjoy.
• Increased efficiency: Data science can help organizations
identify areas for improvement and increase efficiency.
• Predicting outcomes: Data science can help organizations
predict trends and outcomes.
• Measuring performance: Data science can help
organizations measure the effectiveness of their strategies.
• Mitigating risk: Data science can help organizations mitigate
risk and fraud.
• Innovation: Data science can help fuel innovation and drive
change in communities.
Need for Data Science
• Data science combines tools, methods, and technology to
generate meaning from data.
• It's used in many areas, including business, healthcare,
policymaking and innovation.
Business
• Data science can help businesses understand their customers,
improve customer interactions, and tailor their marketing. It
can also help businesses make better decisions, measure
performance, and increase efficiency.
Healthcare
• Data science can lead to improved patient care, disease
prevention, and optimized treatment plans.
Policymaking
• Data science can help policymakers and organizations take
proactive measures and develop sustainable policies. For
example, data scientists can analyze data from social media,
sensors, and public records to identify patterns and trends
related to climate change.
Innovation
• Data science can fuel innovation and drive advancements in
various fields, including finance, transportation, and
cybersecurity.
Data Science Process
• Data science is typically thought of as a five-step process, or
lifecycle:
1. Capture
• This stage is when data scientists gather raw and unstructured
data. The capture stage typically includes data acquisition,
data entry, signal reception and data extraction.
2. Maintain
• This stage is when data is put into a form that can be utilized.
The maintenance stage includes data warehousing, data
cleansing, data staging, data processing and data architecture.
3. Process
• This stage is when data is examined for patterns and biases to
see how it will work as a predictive analysis tool. The process
stage includes data mining, clustering and classification, data
modeling and data summarization.
4. Analyze
• This stage is when multiple types of analyses are performed on
the data. The analysis stage typically includes exploratory and
confirmatory analysis, predictive analysis, regression, text
mining and qualitative analysis.
5. Communicate
• This stage is when data scientists and analysts showcase the
data through reports, charts and graphs. The communication
stage involves data reporting, data visualization, business
intelligence and decision making.
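The five stages above can be sketched end to end on a toy data set. This is only a minimal illustration, not a prescribed implementation: the readings in `RAW` and all function names are hypothetical, and each stage is reduced to a single small step.

```python
import statistics

# Hypothetical raw readings, as they might arrive from a sensor or a form.
RAW = ["12.5", "13.1", "", "bad", "14.0", "13.4"]


def capture():
    """Capture: gather raw, unvalidated records."""
    return list(RAW)


def maintain(records):
    """Maintain: cleanse the data into a usable form."""
    cleaned = []
    for value in records:
        try:
            cleaned.append(float(value))
        except ValueError:
            continue  # skip blanks and malformed entries
    return cleaned


def process(values):
    """Process: summarize the data and look for a simple pattern."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "rising": values[-1] > values[0],
    }


def analyze(values):
    """Analyze: fit a least-squares line and estimate the next reading."""
    xs = range(len(values))
    x_mean = statistics.mean(xs)
    y_mean = statistics.mean(values)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    intercept = y_mean - slope * x_mean
    return slope * len(values) + intercept


def communicate(summary, forecast):
    """Communicate: report the findings in a readable form."""
    return (
        f"{summary['count']} valid readings, mean {summary['mean']:.2f}, "
        f"next reading forecast {forecast:.2f}"
    )


values = maintain(capture())
summary = process(values)
report = communicate(summary, analyze(values))
print(report)
```

In a real project each stage would be far larger (databases, data warehouses, dashboards), but the order of the hand-offs is the same.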
Business Intelligence and Data Science
• Business intelligence (BI) and data science are both essential
for making data-driven decisions.
• Data science focuses on finding patterns and making forecasts
through sophisticated analytics, machine learning, and
algorithms.
• In contrast, the main function of business intelligence is to
provide historical data so that companies can make well-
informed operational decisions.
| S. No. | Factor | Data Science | Business Intelligence |
|---|---|---|---|
| 1. | Concept | It is a field that uses mathematics, statistics and various other tools to discover the hidden patterns in the data. | It is basically a set of technologies, applications and processes that are used by enterprises for business data analysis. |
| 2. | Focus | It focuses on the future. | It focuses on the past and present. |
| 3. | Data | It deals with both structured and unstructured data. | It mainly deals only with structured data. |
| 4. | Flexibility | Data science is much more flexible, as data sources can be added as required. | It is less flexible, as data sources need to be pre-planned. |
| 5. | Method | It makes use of the scientific method. | It makes use of the analytic method. |
| 6. | Complexity | It has a higher complexity in comparison to business intelligence. | It is much simpler when compared to data science. |
| 7. | Expertise | Its expert is the data scientist. | Its expert is the business user. |
| 8. | Questions | It deals with the questions of what will happen and what if. | It deals with the question of what happened. |
| 9. | Storage | The data to be used is disseminated in real-time clusters. | A data warehouse is utilized to hold data. |
| 10. | Integration of data | The ELT (Extract-Load-Transform) process is generally used for the integration of data for data science applications. | The ETL (Extract-Transform-Load) process is generally used for the integration of data for business intelligence applications. |
| 11. | Tools | Its tools are SAS, BigML, MATLAB, Excel, etc. | Its tools are InsightSquared Sales Analytics, Klipfolio, ThoughtSpot, Cyfe, TIBCO Spotfire, etc. |
| 12. | Usage | Companies can harness their potential by anticipating the future scenario using data science in order to reduce risk and increase income. | Business intelligence helps in performing root cause analysis on a failure or in understanding the current status. |
| 13. | Business Value | Greater business value is achieved with data science in comparison to business intelligence, as it anticipates future events. | Business intelligence has lesser business value, as the extraction of business value is carried out statically by plotting charts and KPIs (Key Performance Indicators). |
| 14. | Handling data sets | Technologies such as Hadoop are available, and others are evolving, for handling large data sets. | Sufficient tools and technologies are not available for handling large data sets. |
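The ETL and ELT orderings in row 10 can be contrasted with a small sketch. It assumes an in-memory SQLite database standing in for the data store; the rows in `RAW_ROWS`, the table names and the function names are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical raw rows: (customer, amount stored as messy text).
RAW_ROWS = [("alice", " 10.50"), ("bob", "7"), ("alice", "2.25 ")]


def etl(rows):
    """ETL: transform in application code first, then load the clean rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
    transformed = [(name, float(amount.strip())) for name, amount in rows]
    conn.executemany("INSERT INTO sales VALUES (?, ?)", transformed)
    return conn


def elt(rows):
    """ELT: load the raw rows as-is, then transform inside the data store."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_sales (customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?)", rows)
    conn.execute(
        "CREATE TABLE sales AS "
        "SELECT customer, CAST(TRIM(amount) AS REAL) AS amount FROM raw_sales"
    )
    return conn


def totals(conn):
    """Total sales per customer, from whichever pipeline built the table."""
    query = (
        "SELECT customer, SUM(amount) FROM sales "
        "GROUP BY customer ORDER BY customer"
    )
    return conn.execute(query).fetchall()


etl_totals = totals(etl(RAW_ROWS))
elt_totals = totals(elt(RAW_ROWS))
print(etl_totals)
print(elt_totals)
```

Both pipelines end with the same totals; the difference is where the cleaning happens. With ELT the raw rows stay available in the store, so new transformations can be added later without re-extracting the data.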
Prerequisite for Data Scientist
• Data science involves several components that work together
to extract insights and value from data.
Components of Data Science
• Statistics: Statistics is one of the most important components
of data science. It is a way to collect and analyze large
amounts of numerical data and find meaningful insights in
them.
• Mathematics: Mathematics is a critical part of data science.
Mathematics involves the study of quantity, structure, space,
and change. For a data scientist, a good knowledge of
mathematics is essential.
• Domain Expertise: Domain expertise binds data science
together. It means specialized knowledge or skills in a
particular area. Data science is applied in many areas, and
each one needs its own domain experts.
• Data Collection: Data is gathered and acquired from a
number of sources. This can be unstructured data from social
media, text, or photographs, as well as structured data from
databases.
• Data Preprocessing: Raw data is frequently unreliable,
inconsistent, or incomplete. Data cleaning and preprocessing
are a crucial step for removing mistakes, handling missing
data, and standardizing the data.
• Data Exploration and Visualization: This entails exploring
the data and gaining insights using methods like statistical
analysis and data visualization. To aid in understanding the
data, this may entail developing graphs, charts, and
dashboards.
• Data Modeling: This component entails creating models and
algorithms to analyze the data and derive insights.
Regression, classification, and clustering are a few examples
of the supervised and unsupervised learning techniques that
may be used here.
• Machine Learning: This involves building predictive models
that can learn from data. It may include deep learning
methods, such as neural networks, which are increasingly
significant in data science.
• Communication: This entails informing stakeholders of the
findings of the data analysis. Explaining the results might
involve producing reports, visualizations, and presentations.
• Deployment and Maintenance: The models and algorithms
need to be deployed and maintained when the data science
project is over. This may entail keeping an eye on the models'
performance and upgrading them as necessary.
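Two of the components above, data preprocessing and data modeling, can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions: the values in `raw` are made up, median imputation and z-score standardization are just one possible choice of cleaning steps, and `two_means` is a hypothetical helper implementing a one-dimensional k-means with k = 2.

```python
import statistics

# Hypothetical raw feature with one missing entry (None).
raw = [1.0, None, 1.2, 0.8, 1.1, 5.0, 5.2, 4.8]


def impute_median(values):
    """Preprocessing: replace missing entries with the median of observed ones."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def standardize(values):
    """Preprocessing: rescale to zero mean and unit population std deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def two_means(values, iterations=10):
    """Modeling: 1-D k-means with k=2, initialised at the min and max values."""
    centroids = [min(values), max(values)]
    labels = []
    for _ in range(iterations):
        # Assign each value to the nearer centroid.
        labels = [
            0 if abs(v - centroids[0]) <= abs(v - centroids[1]) else 1
            for v in values
        ]
        # Move each centroid to the mean of its members.
        for k in (0, 1):
            members = [v for v, label in zip(values, labels) if label == k]
            if members:
                centroids[k] = statistics.mean(members)
    return labels, centroids


clean = impute_median(raw)
scaled = standardize(clean)
labels, centroids = two_means(scaled)
print(labels)
```

Clustering is an unsupervised technique, so no labels are supplied; the two groups emerge from the data itself. Libraries commonly used for this work provide more robust versions of each step.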
Data Science Tools & Skills
Data Science Tools:
• Data science professionals require knowledge of data science
tools and programming languages.
• Common data science programming languages include:
– Python: Python is an object-oriented, general-purpose
programming language known for having simple syntax and being
easy to use. It’s often used for executing data analysis, building
websites and software and automating various tasks.
– R: R is a programming language that caters to statistical computing
and graphics. It’s ideal for creating data visualizations and building
statistical software.
Popular data science tools include:
• Apache Spark: Apache Spark is an open-source processing
engine that works well with popular languages like R and
Python. It’s used for handling big data workloads, so teams
can quickly complete analyses and queries on data sets of any
size.
• SQL: SQL is a domain-specific language that specializes in
storing and managing data in relational databases. It’s used to
communicate with relational databases, making it possible to
retrieve data, update data and perform other tasks.
• Tableau: Tableau is a platform that generates data
visualizations and business insights to facilitate information
analysis and sharing. It’s used to share data in understandable
formats, so teams can make faster, data-driven decisions.
• Jupyter Notebook: Jupyter Notebook is a web application
that enables users to share documents containing data
visualizations, live code and other interactive elements. It’s
best used for encouraging collaboration on running
documents.
• Apache Hadoop: Apache Hadoop is an open-source
framework that aids in processing and storing large data sets.
It’s popular for efficiently managing big data, so teams can use
data to assess financial risk, predict customer demand and
quickly locate health data, among other use cases.
• TensorFlow: TensorFlow is an open-source library of tools
and resources for building machine learning applications. It’s
used for training models, monitoring performance and
completing other tasks that natural language processing, image
recognition and other types of machine learning models
depend on.
• PyTorch: PyTorch is an open-source machine learning
framework for developing deep learning models. It’s ideal for
building neural networks to power applications in areas like
computer vision, image recognition and natural language
processing.
Choosing a data science tool depends on a number of variables,
including the problem being solved, the needs of the business and
the skill level of the data scientists involved.
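As an illustration of the SQL entry in the tools list, the sketch below retrieves and updates data in a relational database. It assumes Python's built-in `sqlite3` module with an in-memory database; the `patients` table and its rows are hypothetical.

```python
import sqlite3

# In-memory database, used here only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patients (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
)
conn.executemany(
    "INSERT INTO patients (name, age) VALUES (?, ?)",
    [("Asha", 34), ("Ben", 58), ("Chen", 41)],
)

# Retrieve data: names of patients older than 40, in alphabetical order.
older = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM patients WHERE age > ? ORDER BY name", (40,)
    )
]

# Update data: correct one patient's age, then read it back.
conn.execute("UPDATE patients SET age = ? WHERE name = ?", (35, "Asha"))
asha_age = conn.execute(
    "SELECT age FROM patients WHERE name = ?", ("Asha",)
).fetchone()[0]

print(older, asha_age)
```

The `?` placeholders pass values separately from the SQL text, which is the usual way to avoid building queries by string concatenation.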
Data Science Skills:
• Data scientists who move into more specialized fields within
data science, such as deep learning, neural networks and natural
language processing, will need to learn additional skills and
techniques. The following core skills apply across the field:
• Programming: Using languages like Python and R.
• Database Management: Learning and applying SQL to communicate
with databases.
• Statistics: Having a handle on how to analyze data to solve problems.
• Curiosity: Focused on figuring problems out and always learning new
things.
• Storytelling: The ability to tell stories with data and relay insights.
• Communication: Comfortable collaborating with others and
communicating problems and solutions clearly.