Lecture 1 What Is Data Science Prerequisites, Lifecycle and Applications Simplilearn

Uploaded by

anavlamba94
Copyright
© © All Rights Reserved

What is Data Science: Lifecycle, Applications, Prerequisites and Tools

What Is Data Science?


 Data science is the domain of study that deals with vast volumes of data using modern tools and techniques to
find unseen patterns, derive meaningful information, and make business decisions.
 Data science uses complex machine learning algorithms to build predictive models.
 The data used for analysis can come from many different sources and be presented in various formats.

The Data Science Lifecycle


Data science’s lifecycle consists of five distinct stages, each with its own tasks:

1. Capture: Data Acquisition, Data Entry, Signal Reception, Data Extraction. This stage involves gathering raw
structured and unstructured data.
2. Maintain: Data Warehousing, Data Cleansing, Data Staging, Data Processing, Data Architecture. This stage
covers taking the raw data and putting it in a form that can be used.
3. Process: Data Mining, Clustering/Classification, Data Modeling, Data Summarization. Data scientists take the
prepared data and examine its patterns, ranges, and biases to determine how useful it will be in predictive
analysis.
4. Analyze: Exploratory/Confirmatory, Predictive Analysis, Regression, Text Mining, Qualitative Analysis. Here is
the real meat of the lifecycle. This stage involves performing the various analyses on the data.
5. Communicate: Data Reporting, Data Visualization, Business Intelligence, Decision Making. In this final step,
analysts prepare the analyses in easily readable forms such as charts, graphs, and reports.
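
The five stages above can be sketched as a chain of small functions. This is only an illustrative sketch, not a prescribed implementation: the records, the function names, and the choice of a least-squares line as the analysis step are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical raw records: (advertising spend, sales); None marks a missing value.
RAW_RECORDS = [(1.0, 3.1), (2.0, 4.9), (None, 7.0), (3.0, 7.2), (4.0, 8.8)]


def capture():
    """Capture: gather raw data (hard-coded here instead of a real source)."""
    return list(RAW_RECORDS)


def maintain(records):
    """Maintain: put the raw data into a usable form by dropping incomplete rows."""
    return [(x, y) for x, y in records if x is not None and y is not None]


def process(records):
    """Process: summarize the prepared data (row count, range, average)."""
    xs = [x for x, _ in records]
    ys = [y for _, y in records]
    return {"rows": len(records), "x_range": (min(xs), max(xs)), "y_mean": mean(ys)}


def analyze(records):
    """Analyze: fit y = slope * x + intercept by ordinary least squares."""
    xs = [x for x, _ in records]
    ys = [y for _, y in records]
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in records)
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar


def communicate(summary, model):
    """Communicate: turn the results into a short, readable report."""
    slope, intercept = model
    return (f"{summary['rows']} clean rows; "
            f"fitted line: y = {slope:.2f} * x + {intercept:.2f}")


if __name__ == "__main__":
    clean = maintain(capture())
    print(communicate(process(clean), analyze(clean)))
```

In a real project each stage would be far larger (a warehouse instead of a list, dashboards instead of one string), but the order in which data flows through the stages is the same.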

Prerequisites for Data Science


Machine Learning, Modeling, Statistics, Programming (e.g., Python, R), Databases

Who Oversees the Data Science Process?

 Business Managers: The business managers are the people in charge of overseeing the data science training
method. Their primary responsibility is to collaborate with the data science team to characterize the problem and
establish an analytical method.
 IT Managers: IT managers monitor data science teams and resource them accordingly to ensure that they operate
efficiently and safely. They may also be in charge of creating and maintaining IT environments for data science
teams.
 Data Science Managers: The data science managers make up the final section of the team. They primarily track and
supervise the working procedures of all data science team members. They also manage and keep track of the
day-to-day activities of the three data science teams. They are team builders who can blend project planning and
monitoring with team growth.

Data scientists are among the most recent analytical data professionals who have the technical ability to handle complicated
issues as well as the desire to investigate what questions need to be answered.

On a daily basis, a data scientist may do the following tasks:

1. Discover patterns and trends in datasets to get insights.


2. Create forecasting algorithms and data models.
3. Improve the quality of data or product offerings by utilising machine learning techniques.
4. Distribute suggestions to other teams and top management.
5. Use data tools such as R, SAS, Python, or SQL for data analysis.
6. Stay on top of innovations in the field of data science.

What Does a Data Scientist Do?


A data scientist analyzes business data to extract meaningful insights. In other words, a data scientist solves business
problems through a series of steps, including:

 Before tackling the data collection and analysis, the data scientist determines the problem by asking the right
questions and gaining understanding.
 The data scientist then determines the correct set of variables and data sets.
 The data scientist gathers structured and unstructured data from many disparate sources—enterprise data,
public data, etc.
 Once the data is collected, the data scientist processes the raw data and converts it into a format suitable for
analysis. This involves cleaning and validating the data to guarantee uniformity, completeness, and accuracy.
 After the data has been rendered into a usable form, it’s fed into the analytic system—ML algorithm or a
statistical model. This is where the data scientists analyze and identify patterns and trends.
 When the data has been completely rendered, the data scientist interprets the data to find opportunities and solutions.
 The data scientists finish the task by preparing the results and insights to share with the appropriate stakeholders
and communicating the results.
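
The cleaning and validation step described above (uniformity, completeness, accuracy) can be illustrated with a small sketch. The field names, the sample records, and the 0 to 120 age bound are assumptions chosen for the example, not rules from any particular tool.

```python
def clean_record(record):
    """Return a cleaned copy of one record, or None if it fails validation."""
    name = record.get("name")
    age = record.get("age")
    country = record.get("country")

    # Completeness: every required field must be present and non-empty.
    if name is None or age is None or country is None:
        return None
    if not str(name).strip() or not str(country).strip():
        return None

    # Uniformity: normalize whitespace and letter case so equal values compare equal.
    name = " ".join(str(name).split()).title()
    country = str(country).strip().upper()

    # Accuracy: reject values outside a plausible range (0-120 is an assumed bound).
    try:
        age = int(age)
    except (TypeError, ValueError):
        return None
    if not 0 <= age <= 120:
        return None

    return {"name": name, "age": age, "country": country}


def clean_dataset(records):
    """Clean every record and drop the ones that fail validation."""
    cleaned = (clean_record(r) for r in records)
    return [r for r in cleaned if r is not None]


# Hypothetical raw input with typical problems.
raw = [
    {"name": "  ada   lovelace ", "age": "36", "country": "uk"},
    {"name": "Alan Turing", "age": None, "country": "UK"},   # incomplete
    {"name": "Grace Hopper", "age": 250, "country": "us"},   # implausible age
]
print(clean_dataset(raw))
```

Dropping invalid rows is only one possible policy; depending on the problem, a data scientist might instead impute missing values or send suspicious rows for manual review.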

Why Become a Data Scientist?


The profession of data scientist came in second place in the Best Jobs in America for 2021 survey, with an
average base salary of USD 127,500.

Uses of Data Science

1. Data science may detect patterns in seemingly unstructured or unconnected data, allowing conclusions and
predictions to be made.
2. Tech businesses that acquire user data can utilise strategies to transform that data into valuable or profitable
information.
3. Data Science has also made inroads into the transportation industry, such as with driverless cars.
4. Data Science applications provide a better level of therapeutic customization through genetics and genomics
research.

Where Do You Fit in Data Science?


Data science offers you the opportunity to focus on and specialize in one aspect of the field.

Data Scientist

 Job role: Determine what the problem is, what questions need answers, and where to find the data. Also, they
mine, clean, and present the relevant data.
 Skills needed: Programming skills (SAS, R, Python), storytelling and data visualization, statistical and
mathematical skills, knowledge of Hadoop, SQL, and Machine Learning.

Data Analyst

 Job role: Analysts bridge the gap between the data scientists and the business analysts, organizing and
analyzing data to answer the questions the organization poses. They take the technical analyses and turn them
into qualitative action items.
 Skills needed: Statistical and mathematical skills, programming skills (SAS, R, Python), plus experience in
data wrangling and data visualization.

Data Engineer

 Job role: Data engineers focus on developing, deploying, managing, and optimizing the organization’s data
infrastructure and data pipelines. Engineers support data scientists by helping to transfer and transform data for
queries.
 Skills needed: NoSQL databases (e.g., MongoDB, Cassandra DB), programming languages such as Java and
Scala, and frameworks (Apache Hadoop).

Data Science Tools

 Data Analysis: SAS, Jupyter, RStudio, MATLAB, Excel, RapidMiner
 Data Warehousing: Informatica/Talend, AWS Redshift
 Data Visualization: Jupyter, Tableau, Cognos, RAW
 Machine Learning: Spark MLlib, Mahout, Azure ML Studio

Difference Between Business Intelligence and Data Science

Business Intelligence | Data Science
Uses structured data | Uses both structured and unstructured data
Analytical in nature - provides a historical report of the data | Scientific in nature - perform an in-depth statistical analysis on the data
Use of basic statistics with emphasis on visualization (dashboards, reports) | Leverages more sophisticated statistical and predictive analysis and machine learning (ML)
Compares historical data to current data to identify trends | Combines historical and current data to predict future performance and outcomes
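
The last row of the comparison can be made concrete with a small sketch. The sales figures and both function names are hypothetical, and a straight-line trend stands in here for the "more sophisticated" predictive models a real team would use.

```python
from statistics import mean

# Hypothetical monthly sales figures; the last entry is the current month.
monthly_sales = [100.0, 110.0, 120.0, 130.0]


def bi_report(history):
    """BI-style view: compare the current value with the historical average."""
    past, current = history[:-1], history[-1]
    past_mean = mean(past)
    return {"historical_mean": past_mean,
            "current": current,
            "change": current - past_mean}


def ds_forecast(history, steps_ahead=1):
    """Data-science-style view: fit a linear trend and extrapolate it."""
    n = len(history)
    xs = list(range(n))
    x_bar, y_bar = mean(xs), mean(history)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, history))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    # Predict the value `steps_ahead` periods after the last observation.
    return slope * (n - 1 + steps_ahead) + intercept


print(bi_report(monthly_sales))
print(ds_forecast(monthly_sales))
```

The first function only describes what has already happened; the second uses the same history to estimate a value that has not been observed yet.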

Applications of Data Science


Data science has found its applications in almost every industry.
1. Healthcare: Healthcare companies are using data science to build sophisticated medical instruments to detect and cure
diseases.
2. Gaming: Video and computer games are now being created with the help of data science and that has taken the gaming
experience to the next level.
3. Image Recognition: Identifying patterns in images and detecting objects in an image is one of the most popular data
science applications.
4. Recommendation Systems: Netflix and Amazon give movie and product recommendations based on what you like to
watch, purchase, or browse on their platforms.
5. Logistics: Data Science is used by logistics companies to optimize routes to ensure faster delivery of products and
increase operational efficiency.
6. Fraud Detection: Banking and financial institutions use data science and related algorithms to detect fraudulent
transactions.
7. Internet Search: When we think of search, we immediately think of Google. Search engines use data science
algorithms to return the most relevant results for a query.
8. Speech recognition: Speech recognition is dominated by data science techniques. We may see the excellent work of
these algorithms in our daily lives. Have you ever needed the help of a virtual speech assistant like Google Assistant, Alexa,
or Siri?
9. Targeted Advertising: If you thought search was the most essential data science use, consider the whole digital
marketing spectrum. From display banners on various websites to digital billboards at airports, data science algorithms are
used to decide which advertisements to show to which audience.
10. Airline Route Planning: As a result of data science, it is easier to predict flight delays for the airline industry, which is
helping it grow.
11. Augmented Reality: Last but not least, augmented reality appears to be one of the most fascinating data science
applications for the future.
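
As an illustration of the recommendation systems item in the list above, here is a minimal sketch of user-based collaborative filtering with cosine similarity. It is not how Netflix or Amazon actually build their systems; the ratings, user names, and function names are made up for the example.

```python
from math import sqrt

# Hypothetical ratings (1-5), keyed by user and then by title.
RATINGS = {
    "ana":   {"Alien": 5, "Arrival": 4, "Up": 1},
    "ben":   {"Alien": 4, "Arrival": 5, "Dune": 5},
    "carla": {"Up": 5, "Coco": 4, "Alien": 1},
}


def cosine_similarity(a, b):
    """Cosine similarity of two sparse rating dicts (missing items count as 0)."""
    shared = set(a) & set(b)
    dot = sum(a[item] * b[item] for item in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(user, ratings, top_n=1):
    """Score unseen items by the similarity-weighted ratings of other users."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        weight = cosine_similarity(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + weight * rating
    # Highest score first.
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(recommend("ana", RATINGS))
```

Users with similar tastes get a higher weight, so items they rated highly rise to the top of the suggestions for the target user.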

Examples of Data Science


Here are some brief overviews of several use cases, showing data science’s versatility.

 Law Enforcement: In this scenario, data science is used to help police in Belgium to better understand where
and when to deploy personnel to prevent crime.
 Pandemic Fighting: The state of Rhode Island wanted to reopen schools, but was naturally cautious, considering
the ongoing COVID-19 pandemic. The state used data science to expedite case investigations and contact tracing,
enabling a small staff to handle an overwhelming number of concerned calls from citizens. This information
helped the state set up a call center and coordinate preventative measures.
 Driverless Vehicles: Lunewave, a sensor manufacturing company, was looking for a way to make sensor
technology more cost-effective and accurate. They turned to data science and machine learning to train their
sensors to be safer and more reliable, as well as using data to improve their 3D-printed sensor manufacturing
process.
 Entertainment: Data science enables streaming services to follow and evaluate what consumers view, which aids
in the creation of new TV series and films.
 Finance: Banks and credit card firms mine and analyse data in order to detect fraudulent activities, manage financial
risks on loans and credit lines, and assess client portfolios in order to uncover upselling possibilities.
 Manufacturing: Data science applications in manufacturing include supply chain management and distribution
optimization, as well as predictive maintenance to anticipate probable equipment faults in facilities before they
occur.
 Healthcare: Machine learning models and other data science components are used by hospitals and other healthcare
providers to automate X-ray analysis and assist doctors in diagnosing illnesses and planning treatments based on
previous patient outcomes.
 Retail: Retailers evaluate client behaviour and purchasing trends in order to provide individualized product
suggestions as well as targeted advertising, marketing, and promotions. Data science also assists them in
managing product inventories and supply chains in order to keep items in stock.
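
The fraud detection idea from the Finance example above can be sketched with a simple statistical rule: flag a transaction whose amount lies far from a customer's usual spending. This is a deliberately simplified assumption; real systems combine many signals and learned models, and the threshold of 3 standard deviations and the sample amounts are illustrative only.

```python
from statistics import mean, stdev


def is_suspicious(history, amount, threshold=3.0):
    """Flag `amount` if it is more than `threshold` standard deviations
    away from the mean of past amounts (history needs at least 2 values)."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past amounts are identical: anything different stands out.
        return amount != mu
    return abs(amount - mu) / sigma > threshold


# Hypothetical past card transactions for one customer.
past_amounts = [20, 25, 22, 19, 24, 21, 23, 20, 22]

print(is_suspicious(past_amounts, 23))   # typical purchase
print(is_suspicious(past_amounts, 950))  # far outside the usual range
```

The statistics are computed from the history alone, so a single extreme new value cannot inflate the standard deviation and hide itself.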
