Lecture 1 What Is Data Science Prerequisites, Lifecycle and Applications Simplilearn
1. Capture: Data Acquisition, Data Entry, Signal Reception, Data Extraction. This stage involves gathering raw
structured and unstructured data.
2. Maintain: Data Warehousing, Data Cleansing, Data Staging, Data Processing, Data Architecture. This stage
covers taking the raw data and putting it in a form that can be used.
3. Process: Data Mining, Clustering/Classification, Data Modeling, Data Summarization. Data scientists take the
prepared data and examine its patterns, ranges, and biases to determine how useful it will be in predictive
analysis.
4. Analyze: Exploratory/Confirmatory, Predictive Analysis, Regression, Text Mining, Qualitative Analysis. Here is
the real meat of the lifecycle. This stage involves performing the various analyses on the data.
5. Communicate: Data Reporting, Data Visualization, Business Intelligence, Decision Making. In this final step,
analysts prepare the analyses in easily readable forms such as charts, graphs, and reports.
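As a rough illustration, the five stages above can be lined up as a tiny Python pipeline. This is only a sketch under stated assumptions: the sample records, the field names, and the function names (capture, maintain, process, analyze, communicate) are hypothetical and chosen purely to mirror the stage names; real projects would use far richer tooling at each stage.

```python
from statistics import mean

def capture():
    """Capture: gather raw records (hypothetical sample data, one value missing)."""
    return [
        {"store": "A", "sales": "120"},
        {"store": "B", "sales": ""},      # missing value in the raw data
        {"store": "C", "sales": "95"},
        {"store": "D", "sales": "143"},
    ]

def maintain(raw):
    """Maintain: clean the raw data and put it in a usable (numeric) form."""
    return [
        {"store": r["store"], "sales": int(r["sales"])}
        for r in raw
        if r["sales"].strip()             # drop records with no sales figure
    ]

def process(clean):
    """Process: examine the range of the prepared data."""
    values = [r["sales"] for r in clean]
    return {"count": len(values), "min": min(values), "max": max(values)}

def analyze(clean):
    """Analyze: run an analysis, here simply the average."""
    return mean(r["sales"] for r in clean)

def communicate(summary, average):
    """Communicate: present the results in an easily readable form."""
    return (f"{summary['count']} stores, sales from {summary['min']} "
            f"to {summary['max']}, average {average:.1f}")

clean = maintain(capture())
report = communicate(process(clean), analyze(clean))
print(report)  # 3 stores, sales from 95 to 143, average 119.3
```

Keeping each stage as its own function makes the hand-off between stages explicit, which is the main point of the lifecycle view.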
Business Managers: The business managers are the people in charge of overseeing the data science training
method. Their primary responsibility is to collaborate with the data science team to characterize the problem and
establish an analytical method.
IT Managers: IT managers monitor the data science teams and resource them accordingly to ensure that they operate
efficiently and safely. They may also be in charge of creating and maintaining IT environments for data science
teams.
Data Science Managers: The data science managers make up the final section of the team. They primarily track and
supervise the working procedures of all data science team members. They also manage and keep track of the
day-to-day activities of the three data science teams. They are team builders who can blend project planning and
monitoring with team growth.
Data scientists are among the most recent analytical data professionals who have the technical ability to handle complicated
issues as well as the desire to investigate what questions need to be answered.
Before tackling the data collection and analysis, the data scientist determines the problem by asking the right
questions and gaining understanding.
The data scientist then determines the correct set of variables and data sets.
The data scientist gathers structured and unstructured data from many disparate sources—enterprise data,
public data, etc.
Once the data is collected, the data scientist processes the raw data and converts it into a format suitable for
analysis. This involves cleaning and validating the data to guarantee uniformity, completeness, and accuracy.
After the data has been rendered into a usable form, it is fed into the analytic system, such as an ML algorithm or a
statistical model. This is where the data scientist analyzes the data and identifies patterns and trends.
Once the analysis is complete, the data scientist interprets the results to find opportunities and solutions.
The data scientists finish the task by preparing the results and insights to share with the appropriate stakeholders
and communicating the results.
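The cleaning and validation step in the workflow above (checking uniformity, completeness, and accuracy) can be sketched in a few lines of Python. This is a minimal sketch, not a standard recipe: the record fields (`country`, `age`), the assumed plausible age range of 0 to 120, and the function name `clean_records` are all hypothetical.

```python
def clean_records(records):
    """Return (clean, rejected) lists after basic validation.

    Uniformity:   normalize country codes to stripped, upper-case strings.
    Completeness: reject records with a missing country or age.
    Accuracy:     reject ages outside an assumed plausible range (0 to 120).
    """
    clean, rejected = [], []
    for rec in records:
        country = (rec.get("country") or "").strip().upper()
        age = rec.get("age")
        if not country or age is None:
            rejected.append((rec, "incomplete"))
        elif not 0 <= age <= 120:
            rejected.append((rec, "implausible age"))
        else:
            clean.append({"country": country, "age": age})
    return clean, rejected

# Hypothetical raw input with typical problems.
raw = [
    {"country": " us ", "age": 34},   # needs normalizing
    {"country": "DE", "age": None},   # missing age
    {"country": "fr", "age": 240},    # implausible age
    {"country": "", "age": 51},       # missing country
]
clean, rejected = clean_records(raw)
print(clean)                                # [{'country': 'US', 'age': 34}]
print([reason for _, reason in rejected])   # ['incomplete', 'implausible age', 'incomplete']
```

Returning the rejected records with a reason, rather than silently dropping them, keeps the cleaning step auditable.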
1. Data science may detect patterns in seemingly unstructured or unconnected data, allowing conclusions and
predictions to be made.
2. Tech businesses that acquire user data can utilise strategies to transform that data into valuable or profitable
information.
3. Data Science has also made inroads into the transportation industry, such as with driverless cars.
4. Data Science applications provide a better level of therapeutic customization through genetics and genomics
research.
Data Scientist
Job role: Data scientists determine what the problem is, what questions need answers, and where to find the data.
They also mine, clean, and present the relevant data.
Skills needed: Programming skills (SAS, R, Python), storytelling and data visualization, statistical and
mathematical skills, knowledge of Hadoop, SQL, and Machine Learning.
Data Analyst
Job role: Analysts bridge the gap between the data scientists and the business analysts, organizing and
analyzing data to answer the questions the organization poses. They take the technical analyses and turn them
into qualitative action items.
Skills needed: Statistical and mathematical skills, programming skills (SAS, R, Python), plus experience in
data wrangling and data visualization.
Data Engineer
Job role: Data engineers focus on developing, deploying, managing, and optimizing the organization’s data
infrastructure and data pipelines. Engineers support data scientists by helping to transfer and transform data for
queries.
Skills needed: NoSQL databases (e.g., MongoDB, Cassandra DB), programming languages such as Java and
Scala, and frameworks (Apache Hadoop).
Data Science Tools
Business Intelligence | Data Science
Analytical in nature: provides a historical report of the data | Scientific in nature: performs an in-depth statistical analysis on the data
Use of basic statistics with emphasis on visualization (dashboards, reports) | Leverages more sophisticated statistical and predictive analysis and machine learning (ML)
Compares historical data to current data to identify trends | Combines historical and current data to predict future performance and outcomes
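The contrast drawn above, between comparing historical data with current data and using both to predict future performance, can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the monthly sales figures are invented, the function names `descriptive_view` and `predictive_forecast` are hypothetical, and a straight-line least-squares fit stands in for the far more sophisticated models used in practice.

```python
from statistics import mean

# Hypothetical monthly sales figures, oldest first.
sales = [100, 110, 120, 130, 140, 150]

def descriptive_view(values):
    """Descriptive: compare the latest value with the historical average."""
    history, current = values[:-1], values[-1]
    avg = mean(history)
    return {"historical_avg": avg, "current": current, "change": current - avg}

def predictive_forecast(values, steps_ahead=1):
    """Predictive: fit a least-squares line to all values and extrapolate."""
    n = len(values)
    xs = list(range(n))
    x_bar, y_bar = mean(xs), mean(values)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return intercept + slope * (n - 1 + steps_ahead)

print(descriptive_view(sales))     # current month is 30 above the historical average
print(predictive_forecast(sales))  # 160.0, the next point on the fitted line
```

The first function only describes what has already happened; the second uses the same data to estimate a value that has not been observed yet.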
Law Enforcement: Data science is used to help police in Belgium better understand where and when to deploy
personnel to prevent crime.
Pandemic Fighting: The state of Rhode Island wanted to reopen schools, but was naturally cautious, considering
the ongoing COVID-19 pandemic. The state used data science to expedite case investigations and contact tracing,
enabling a small staff to handle an overwhelming number of concerned calls from citizens. This information
helped the state set up a call center and coordinate preventative measures.
Driverless Vehicles: Lunewave, a sensor manufacturing company, was looking for a way to make sensor
technology more cost-effective and accurate. They turned to data science and machine learning to train their
sensors to be safer and more reliable, as well as using data to improve their 3D-printed sensor manufacturing
process.
Entertainment: Data science enables streaming services to follow and evaluate what consumers view, which aids
in the creation of new TV series and films.
Finance: Banks and credit card firms mine and analyse data in order to detect fraudulent activities, manage financial
risks on loans and credit lines, and assess client portfolios in order to uncover upselling possibilities.
Manufacturing: Data science applications in manufacturing include supply chain management and distribution
optimization, as well as predictive maintenance to anticipate probable equipment faults in facilities before they
occur.
Healthcare: Machine learning models and other data science components are used by hospitals and other healthcare
providers to automate X-ray analysis and assist doctors in diagnosing illnesses and planning treatments based on
previous patient outcomes.
Retail: Retailers evaluate client behaviour and purchasing trends in order to provide individualized product
suggestions as well as targeted advertising, marketing, and promotions. Data science also assists them in
managing product inventories and supply chains in order to keep items in stock.
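To make one of the applications above more concrete, the fraud-detection idea mentioned under Finance can be sketched as a simple outlier check. This is a toy illustration only, not how any particular bank works: the transaction amounts are invented, the function name `flag_unusual` is hypothetical, and the z-score threshold of 2.0 is an arbitrary assumption. Production systems rely on much richer features and models.

```python
from statistics import mean, pstdev

def flag_unusual(amounts, threshold=2.0):
    """Return the amounts whose z-score exceeds the (assumed) threshold."""
    mu = mean(amounts)
    sigma = pstdev(amounts)
    if sigma == 0:
        return []                      # all amounts identical, nothing stands out
    return [a for a in amounts if abs(a - mu) / sigma > threshold]

# Hypothetical card transactions for one customer.
transactions = [20, 25, 22, 19, 24, 21, 23, 500]
print(flag_unusual(transactions))      # [500]
```

A z-score measures how many standard deviations a value lies from the mean, so the single very large transaction stands out while the routine ones do not.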