Introduction to Data Science Important Questions
Data Science is an interdisciplinary field that uses scientific methods, algorithms, processes, and systems to
extract insights and knowledge from structured and unstructured data. It combines statistics, computer
1. Data Collection
2. Data Cleaning
4. Feature Engineering
5. Model Building
6. Model Evaluation
7. Deployment
- Structured Data: Organized in rows and columns (e.g., Excel sheets, SQL databases).
- Unstructured Data: Lacks a fixed structure (e.g., images, videos, emails, social media posts).
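As a rough illustration of the two kinds of data, the sketch below reads a small table with named columns and then treats a free-text message as a bag of words. The sample values and the variable names (csv_text, email, words) are invented for this example.

```python
import csv
import io

# Structured: every record has the same named columns, so a field
# can be looked up directly by its column name.
csv_text = "name,age\nAsha,31\nBen,27\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))
ages = [int(row["age"]) for row in rows]

# Unstructured: free text has no columns; any structure (here, a
# simple word list) has to be extracted before analysis.
email = "Hi team, the meeting moved to Friday. Thanks, Asha"
words = email.lower().replace(",", " ").replace(".", " ").split()

print(ages)
print(words)
```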
Machine Learning is a subset of AI that enables systems to learn and improve from experience without being
explicitly programmed. It uses algorithms to identify patterns and make decisions based on data.
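One way to see "learning from data" in miniature is a least-squares line fit: the relationship between input and output is estimated from examples instead of being written by hand. This is only a sketch of one simple technique, not a description of machine learning as a whole; the data values and the helper name fit_line are made up for illustration.

```python
# Training examples: hours studied -> exam score (invented values).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "Training": the parameters are estimated from the data.
slope, intercept = fit_line(hours, scores)

# "Prediction": apply the learned pattern to an unseen input.
predicted = slope * 6.0 + intercept
print(slope, intercept, predicted)
```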
Statistics helps in data collection, analysis, interpretation, and presentation. It is used to summarize data, find
patterns, and make informed decisions through methods like probability, hypothesis testing, and regression
analysis.
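A minimal sketch of summarizing data with Python's built-in statistics module is shown below. It covers only descriptive summaries (center and spread), not hypothesis testing or regression, and the sample values are invented.

```python
import statistics

# A small invented sample, e.g. daily order counts.
sample = [12, 15, 11, 14, 13, 15, 18]

mean = statistics.mean(sample)      # arithmetic average
median = statistics.median(sample)  # middle value after sorting
mode = statistics.mode(sample)      # most frequent value
spread = statistics.stdev(sample)   # sample standard deviation (n - 1)

print(mean, median, mode, round(spread, 3))
```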
Data Wrangling (or Data Cleaning) is the process of cleaning and transforming raw data into a usable format.
It involves handling missing values, correcting data types, and removing duplicates.
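The three steps named above (removing duplicates, correcting data types, handling missing values) can be sketched in plain Python as follows. The records, the clean function, and the choice to fill a missing age with the mean of the known ages are assumptions made for this example; real projects often use a library such as pandas and pick a fill strategy that suits the data.

```python
# Raw records as they might arrive from a CSV file: everything is text.
raw = [
    {"id": "1", "age": "34", "city": "Pune"},
    {"id": "2", "age": "", "city": "Delhi"},    # missing age
    {"id": "1", "age": "34", "city": "Pune"},   # exact duplicate of row 1
    {"id": "3", "age": "29", "city": "Mumbai"},
]


def clean(records):
    """Drop exact duplicates, convert types, and fill missing ages."""
    seen = set()
    typed = []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key in seen:
            continue  # remove duplicates
        seen.add(key)
        typed.append({
            "id": int(rec["id"]),                             # correct the type
            "age": int(rec["age"]) if rec["age"] else None,   # mark missing
            "city": rec["city"],
        })
    # Handle missing values: fill with the mean of the known ages
    # (an illustrative choice, not the only reasonable one).
    known = [r["age"] for r in typed if r["age"] is not None]
    fill = sum(known) / len(known)
    for r in typed:
        if r["age"] is None:
            r["age"] = fill
    return typed


cleaned = clean(raw)
for row in cleaned:
    print(row)
```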
Exploratory Data Analysis (EDA) is the process of analyzing data sets to summarize their main characteristics. It helps in understanding
the data, detecting outliers, and choosing the right modeling techniques.
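As one small example of outlier detection during exploration, the sketch below applies the common 1.5 x IQR rule using the standard library. The readings are invented, and the 1.5 multiplier is a convention rather than a fixed requirement.

```python
import statistics

# Invented measurements; one value is far from the rest.
readings = [21, 23, 22, 25, 24, 26, 23, 95]

# Quartiles split the sorted data into four parts.
q1, q2, q3 = statistics.quantiles(readings, n=4)
iqr = q3 - q1

# Values beyond 1.5 * IQR from the quartiles are flagged as outliers.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
outliers = [x for x in readings if x < lower or x > upper]

print("quartiles:", q1, q2, q3)
print("outliers:", outliers)
```

Flagged values are candidates for a closer look, not automatic deletions: an outlier may be a data-entry error or a genuine extreme observation.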
10. What is the difference between AI, Machine Learning, and Data Science?
- Data Science: Uses ML, statistics, and domain knowledge to extract insights from data.