FDS-Unit II-ECE
● Structured Data – Organized in a tabular format (e.g., relational databases, Excel sheets).
● Unstructured Data – Does not follow a fixed format (e.g., text, images, videos, emails).
● Primary Data – Collected firsthand for a specific purpose (e.g., surveys, experiments).
● Secondary Data – Pre-existing data collected by others (e.g., research papers, reports).
1. Types of Data Collection Methods
A. Primary Data Collection (First-Hand Data) – data gathered directly for the purpose at hand, e.g., surveys, experiments, interviews, and observations.
B. Secondary Data Collection (Pre-Existing Data) – common sources include (a short collection sketch follows this list):
🔹 Databases & Repositories – Government, corporate, or public datasets (e.g., Kaggle, UCI).
🔹 APIs (Application Programming Interfaces) – Real-time access to online data (e.g., weather, social media).
🔹 Open Data Portals – Publicly available datasets (e.g., World Bank, WHO).
🔹 Research Papers & Reports – Academic and industry studies.
🔹 Logs & Transactions – System-generated records (e.g., server logs, purchase histories).
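As a quick illustration of collecting data through an API and from a downloaded open dataset, here is a minimal Python sketch; the endpoint URL, query parameters, and file names are placeholders, not real services.

```python
# Minimal sketch: collecting data from a REST API and from a downloaded open dataset.
# The endpoint URL, query parameters, and file names below are placeholders.
import requests
import pandas as pd

API_URL = "https://example.com/api/weather"          # hypothetical endpoint
response = requests.get(API_URL, params={"city": "Pune"}, timeout=10)
response.raise_for_status()                          # stop on HTTP errors
records = response.json()                            # parse the JSON payload

df_api = pd.DataFrame(records)                       # convert to a structured (tabular) form

# A dataset downloaded from an open data portal (e.g., Kaggle, UCI) loads directly:
df_csv = pd.read_csv("downloaded_dataset.csv")
print(df_api.head())
print(df_csv.shape)
```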
2. Data Collection Challenges
✔ Data Quality – Ensuring accuracy, consistency, and completeness.
✔ Ethical Considerations – Respecting privacy and security (e.g., GDPR, HIPAA).
✔ Volume & Scalability – Handling large datasets efficiently.
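Some of the quality checks above can be automated before analysis begins. The sketch below is a minimal pandas example; the file name and columns are assumptions for illustration.

```python
# Minimal sketch of basic data-quality checks with pandas.
# The file name and its columns are assumptions for illustration.
import pandas as pd

df = pd.read_csv("collected_data.csv")

print(df.isna().sum())              # completeness: missing values per column
print(df.duplicated().sum())        # consistency: count of duplicate records
print(df.describe(include="all"))   # quick sanity check on ranges and categories

clean = df.drop_duplicates()                      # remove exact duplicates
needs_review = clean[clean.isna().any(axis=1)]    # flag incomplete rows for follow-up
```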
Key Characteristics of Data Analysis:
● Descriptive & Diagnostic – Focuses on "What happened?" and "Why did it happen?"
● Exploratory Approach – Identifies patterns, relationships, and anomalies in data.
Key Characteristics of Data Analytics:
● Predictive & Prescriptive – Focuses on "What will happen?" and "How can we make it happen?"
● Data-Driven Decision Making – Uses algorithms and AI to extract deeper insights.
● Machine Learning & AI Techniques – Regression, clustering, deep learning.
● Business Intelligence (BI) Applications – Dashboards, KPI tracking, forecasting models.
Descriptive analytics involves:
1. Summarizing Data: Collecting historical data and presenting it in a readable, understandable format using measures such as the mean, median, and mode.
2. Data Visualization: Using charts, graphs, histograms, and other visual aids to make data easily digestible.
3. Identifying Trends: Spotting patterns and trends over time, such as sales trends, customer behavior, and operational performance (a short Python sketch of these steps follows this list).
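The following is a minimal sketch of these descriptive steps on a hypothetical sales file; the file name and the 'month' and 'revenue' columns are assumptions.

```python
# Minimal sketch of descriptive analytics on a hypothetical sales file with
# 'month' and 'revenue' columns (names are assumptions for illustration).
import pandas as pd
import matplotlib.pyplot as plt

sales = pd.read_csv("monthly_sales.csv")

# 1. Summarize with central-tendency measures
print("Mean:", sales["revenue"].mean())
print("Median:", sales["revenue"].median())
print("Mode:", sales["revenue"].mode()[0])

# 2 & 3. Visualize the trend over time
sales.plot(x="month", y="revenue", kind="line", title="Monthly revenue trend")
plt.show()
```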
Diagnostic analytics digs into why something happened. Common techniques include (a small sketch follows the tools list below):
1. Identifying Anomalies: Look for outliers or unusual patterns that deviate from the norm.
2. Drill-Down Analysis: Dig deeper into data subsets to get more detailed insights and identify the underlying factors.
3. Correlation Analysis: Assess the relationships between different variables to understand how they influence one another.
4. Hypothesis Testing: Form and test hypotheses to determine the causes of specific outcomes.
Tools commonly used for diagnostic analysis:
● R and Python: These programming languages are popular for their powerful statistical and data analysis packages.
● SQL: Helpful for querying detailed data sets and performing complex joins.
● Tableau and Power BI: These tools allow you to drill down into visual data representations and explore the factors behind the trends.
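Here is a minimal diagnostic sketch in Python covering anomaly detection, correlation, and a hypothesis test; the column names ('revenue', 'ad_spend', 'region') and region labels are assumptions for illustration.

```python
# Minimal sketch of diagnostic analysis in Python: outliers, correlation, and a
# hypothesis test. Column names ('revenue', 'ad_spend', 'region') are assumptions.
import pandas as pd
from scipy import stats

df = pd.read_csv("monthly_sales.csv")

# Identifying anomalies: flag revenues more than 3 standard deviations from the mean
z = (df["revenue"] - df["revenue"].mean()) / df["revenue"].std()
outliers = df[z.abs() > 3]

# Correlation analysis: how strongly does ad spend move with revenue?
print("Correlation:", df["ad_spend"].corr(df["revenue"]))

# Hypothesis testing: do two regions differ in average revenue?
north = df.loc[df["region"] == "North", "revenue"]
south = df.loc[df["region"] == "South", "revenue"]
t_stat, p_value = stats.ttest_ind(north, south, equal_var=False)
print("t =", t_stat, "p =", p_value)
```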
Prescriptive analytics is a type of data analytics that goes beyond
descriptive and diagnostic analytics. While descriptive analytics tells you what
happened and diagnostic analytics explains why it happened, prescriptive
analytics recommends actions you can take to achieve desired outcomes. It
leverages advanced techniques such as optimization algorithms, machine
learning, and simulation to provide actionable insights and guidance on the best
course of action.
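One common prescriptive technique is optimization. The sketch below frames a tiny product-mix decision as a linear program with SciPy; the products, profits, and resource limits are made-up numbers, not data from the text.

```python
# Minimal sketch of prescriptive analytics as a linear program with SciPy.
# The two products, their profits, and the resource limits are made-up numbers.
from scipy.optimize import linprog

# Maximize profit 40*x1 + 30*x2; linprog minimizes, so negate the objective.
c = [-40, -30]

A = [[2, 1],    # machine hours consumed per unit of each product
     [1, 3]]    # labour hours consumed per unit of each product
b = [100, 90]   # machine and labour hours available

result = linprog(c, A_ub=A, b_ub=b, bounds=[(0, None), (0, None)])
print("Recommended production quantities:", result.x)
print("Expected profit:", -result.fun)
```

The recommended quantities are the "action" prescriptive analytics hands back to the decision-maker, rather than just a forecast.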
Predictive analytics involves using historical data, machine learning algorithms, and statistical techniques
to predict future outcomes. It aims to forecast trends, behaviors, and events by analyzing patterns found in existing
data. This technique is widely applied across various industries, including finance, marketing, healthcare, and more.
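As a minimal predictive sketch, the example below fits a regression model on historical data and evaluates its forecasts on held-out records; the file and column names are assumptions for illustration.

```python
# Minimal sketch of predictive analytics: fit a regression model on historical
# data and forecast on held-out data. File and column names are assumptions.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

df = pd.read_csv("historical_sales.csv")
X = df[["ad_spend", "price", "season_index"]]   # historical drivers (features)
y = df["revenue"]                               # outcome to forecast (target)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression().fit(X_train, y_train)
forecast = model.predict(X_test)
print("MAE on held-out data:", mean_absolute_error(y_test, forecast))
```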
Data Analytics vs Data Analysis:

| Aspect      | Data Analytics                                      | Data Analysis                                   |
|-------------|-----------------------------------------------------|-------------------------------------------------|
| Goal        | To provide actionable insights for decision-making  | To uncover patterns and trends within the data  |
| Tools       | Advanced tools and software                         | Statistical tools and basic software            |
| Application | Business intelligence, marketing, finance, etc.     | Research, academic studies, operations          |