Data Science Curriculum 2024
dataminds™
CLASSROOM | ONLINE | HYBRID MODEL
3 MONTHS - Certificate Course
6 MONTHS - Diploma in Data Science
dataminds.in dataminds 100% PLACEMENT ASSISTANCE dataminds_in dataminds.in
MODULE 01
PYTHON PROGRAMMING
MODULE 02
TRAINING | PLACEMENTS | CONSULTING
NUMPY
Introduction to NumPy
What is NumPy?
History of NumPy
PANDAS
Introduction to Pandas
What is Pandas?
History and evolution of Pandas
MODULE 03
MATPLOTLIB
Overview of Matplotlib
Matplotlib Basics
Installing Matplotlib
Basic plotting with Matplotlib
Line plots, scatter plots, and bar plots
SEABORN
Overview of Matplotlib
Seaborn Introduction
Installing Seaborn
Overview of Seaborn's capabilities
MODULE 04
Handling Duplicates
Identifying and removing duplicate records
Strategies for handling duplicate values
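As a rough illustration of the duplicate-handling topics above, the sketch below identifies and removes duplicate records in plain Python. The sample records, field names, and helper functions are hypothetical; in practice a library such as pandas (`DataFrame.duplicated`, `DataFrame.drop_duplicates`) would usually do this work.

```python
# Hypothetical sample records; the field names are assumptions for illustration.
records = [
    {"id": 1, "name": "Asha", "city": "Hyderabad"},
    {"id": 2, "name": "Ravi", "city": "Pune"},
    {"id": 1, "name": "Asha", "city": "Hyderabad"},  # exact duplicate of the first row
    {"id": 3, "name": "Ravi", "city": "Pune"},       # same name/city, different id
]


def find_duplicates(rows, key_fields):
    """Return the rows whose key (the chosen fields) was already seen earlier."""
    seen = set()
    duplicates = []
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        if key in seen:
            duplicates.append(row)
        else:
            seen.add(key)
    return duplicates


def drop_duplicates(rows, key_fields):
    """Keep only the first occurrence of each key, preserving the original order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


all_fields = ["id", "name", "city"]
exact_dupes = find_duplicates(records, all_fields)       # rows identical in every field
deduped_exact = drop_duplicates(records, all_fields)     # strategy 1: drop exact copies
deduped_by_person = drop_duplicates(records, ["name", "city"])  # strategy 2: dedupe on a subset
```

The choice of key fields is the actual strategy decision: deduplicating on every field is conservative, while deduplicating on a subset removes more rows but can merge records that are genuinely different.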
Type Casting
Converting data types for compatibility and efficiency
Addressing issues with incorrect data types
Feature Engineering
Creating new features for better model performance
Techniques such as encoding, scaling & transformations
MODULE 05
DATA SCIENCE
STATISTICS
Descriptive Statistics
Inferential Statistics:
Hypothesis Testing
Formulating a Hypothesis
Choosing Null and Alternative Hypotheses
Type I or Alpha Error and Type II or Beta Error
Confidence Level, Significance Level, Power of Test
Confidence Intervals
Confidence Interval - Concept
P-value:
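The hypothesis-testing topics above (null and alternative hypotheses, significance level, confidence interval, p-value) can be tied together in one small worked example. The sketch below is a two-sided one-sample z-test using only the Python standard library; the sample values, the hypothesized mean, and the assumption that the population standard deviation is known are all invented for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean

# Hypothetical sample; sigma (population standard deviation) is assumed known.
sample = [52.1, 49.8, 51.5, 50.9, 53.2, 50.4, 51.8, 52.6, 49.9, 51.3]
mu_0 = 50.0   # H0: mu = mu_0, H1: mu != mu_0
sigma = 1.5   # assumed known population standard deviation
alpha = 0.05  # significance level, the accepted probability of a Type I error

n = len(sample)
x_bar = mean(sample)
standard_error = sigma / sqrt(n)

# Test statistic and two-sided p-value under H0.
z = (x_bar - mu_0) / standard_error
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# (1 - alpha) confidence interval for the population mean.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
ci_low = x_bar - z_crit * standard_error
ci_high = x_bar + z_crit * standard_error

# Reject H0 when the p-value falls below the significance level.
reject_h0 = p_value < alpha
```

For a two-sided test, rejecting H0 at level alpha is equivalent to the hypothesized mean falling outside the (1 - alpha) confidence interval, which is why the two concepts are usually taught together.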
MODULE 06
Math Fundamentals
Foundations of Machine Learning
Python programming basics
Linear algebra and calculus essentials
Introduction to Machine Learning
Overview of machine learning
Model Selection
Setting up development environments (Python, Jupyter, sklearn libraries)
Supervised Learning
Definition of supervised learning
Explanation of the difference between supervised and unsupervised learning.
Regression
Correlation
Scatter Diagram
Correlation coefficient
Correlation analysis
Correlation coefficient
Regression
Linear Regression
Simple Regression
Linear Equation - coefficients, intercept
Residuals, Least Squares Method
Assumptions of Linear Regression
Homoscedasticity, Heteroscedasticity
Multicollinearity
Polynomial Regression
Classification
Definition of classification.
Understanding the concept of class labels.
Binary and multiclass classification.
Logistic regression
Types of Logistic regression
Logit and Log-Likelihood
Sigmoid function
Analysis of logistic regression results
Multiple Logistic regression
Evaluation metrics
Confusion matrix
AUC / ROC for binary classifier
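To make the regression topics above concrete (linear equation with coefficient and intercept, residuals, least squares method, correlation coefficient), here is a minimal sketch of simple linear regression computed by hand. The data points are hypothetical and chosen only so the arithmetic is easy to follow; a library such as scikit-learn would normally be used instead.

```python
# Hypothetical data for illustration: hours studied (x) and exam score (y).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [52.0, 55.0, 61.0, 64.0, 68.0]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Least squares estimates for the line y = intercept + slope * x.
s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
s_xx = sum((xi - x_mean) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = y_mean - slope * x_mean

# Residuals: observed value minus fitted value.
fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Pearson correlation coefficient between x and y.
s_yy = sum((yi - y_mean) ** 2 for yi in y)
r = s_xy / (s_xx * s_yy) ** 0.5
```

With an intercept in the model, the least squares residuals sum to zero, which is a quick sanity check on any hand-computed fit.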
Clustering
Distance Metrics
k-Means clustering
Hierarchical Clustering
Non-Hierarchical Clustering, DBSCAN
Clustering Evaluation metrics
K-Means Clustering:
In-depth coverage of the K-means algorithm, its initialization methods, and convergence.
Practical implementation and examples.
Association Rules
Association rules mining
Market Basket Analysis
Apriori Algorithm, FP-Growth
Metrics - Support, Confidence, Lift
Recommender Systems
User Based Collaborative Filtering
Similarity Metrics
Item Based Collaborative Filtering
Search Based Methods
SVD Method
Dimensionality Reduction:
Principal Component Analysis (PCA):
In-depth coverage of PCA, including eigenvalue decomposition and feature extraction.
Applications in reducing dimensionality.
t-Distributed Stochastic Neighbor Embedding (t-SNE):
Explanation of t-SNE and its use for visualizing high-dimensional data.
Comparison with other dimensionality reduction techniques.
Natural Language Processing (NLP)
Tokenization and text processing
Introduction to language models
Text Mining and Natural Language Processing (NLP)
Sources of data
Bag of words
Pre-processing, corpus, Document Term Matrix (DTM) & TDM
Word Clouds
Corpus-level word clouds
Sentiment Analysis
Positive word clouds
Negative word clouds
Unigram, Bigram, Trigram
Semantic network
Extract user reviews of products/services from Amazon and tweets from Twitter
Install Libraries from Shell
Extraction and text analytics in Python
LDA / Latent Dirichlet Allocation
Topic Modelling
Sentiment Extraction
Lexicons & Emotion Mining
Applications and Use Cases
Live Projects
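For the NLP topics in this module (tokenization, bag of words, Document Term Matrix and its transpose the Term Document Matrix, and unigrams/bigrams/trigrams), a small standard-library sketch may help. The mini-corpus and the helper names `tokenize` and `ngrams` are hypothetical, and the whitespace tokenizer is a deliberate simplification of real pre-processing.

```python
from collections import Counter

# Hypothetical mini-corpus; the review texts are invented for illustration.
corpus = [
    "the battery life is great",
    "the screen is great but the battery is weak",
    "weak battery and poor screen",
]


def tokenize(text):
    """Very simple pre-processing: lowercase and split on whitespace."""
    return text.lower().split()


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) from a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokenized = [tokenize(doc) for doc in corpus]

# Vocabulary: sorted unique terms across the whole corpus.
vocabulary = sorted({term for tokens in tokenized for term in tokens})

# Document Term Matrix: one row per document, one column per vocabulary term.
dtm = []
for tokens in tokenized:
    counts = Counter(tokens)
    dtm.append([counts[term] for term in vocabulary])

# Term Document Matrix is the transpose: one row per term, one column per document.
tdm = [list(column) for column in zip(*dtm)]

# Bigrams of the first document (n=1 gives unigrams, n=3 gives trigrams).
bigrams_doc0 = ngrams(tokenized[0], 2)
```

Each row of the DTM is the bag-of-words representation of one document: word order is discarded and only the term counts remain, which is why n-grams are often added to recover some local context.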
MODULE 07
Deep Learning
Computer Vision
Introduction to Vision
Importance of Image Processing
Image Processing Challenges – Interclass Variation, Viewpoint Variation, Illumination, Background Clutter, Occlusion & Large Number of Categories
MODULE 08
ADVANCE
Speech Recognition
DALL-E
DALL-E is a generative model in the field of artificial intelligence, developed by OpenAI,
that creates images from text descriptions. The name "DALL-E" combines the names of the
artist Salvador Dalí and the robot character WALL-E from the Pixar film of the same name.
MODULE 09
DATABASE
4.Update
Modifying existing records in a table using the UPDATE statement
5.Delete
Removing records from a table using the DELETE statement
3.History of MongoDB
Evolution and development of MongoDB
4.Features of NoSQL Databases
Flexibility, scalability, and other key features of NoSQL databases
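The UPDATE and DELETE statements listed in this module can be tried out without installing a database server. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `students` table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, course TEXT)")
cur.executemany(
    "INSERT INTO students (id, name, course) VALUES (?, ?, ?)",
    [(1, "Asha", "Certificate"), (2, "Ravi", "Certificate"), (3, "Meena", "Diploma")],
)

# UPDATE modifies existing rows that match the WHERE clause.
cur.execute("UPDATE students SET course = ? WHERE id = ?", ("Diploma", 2))
updated_rows = cur.rowcount

# DELETE removes rows that match the WHERE clause.
cur.execute("DELETE FROM students WHERE id = ?", (1,))
deleted_rows = cur.rowcount

conn.commit()
remaining = cur.execute("SELECT id, name, course FROM students ORDER BY id").fetchall()
conn.close()
```

Both statements act on every row when the WHERE clause is omitted, so it is worth running the same condition as a SELECT first to see which rows would be affected.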
MODULE 10
BIG DATA
Section 1
1.Hadoop Introduction
Definition and Purpose
Historical Context
2.Hadoop Architecture
Components: NameNode, DataNode, ResourceManager, NodeManager
High-Level Architecture Overview
3.Hadoop Ecosystem
Overview of various tools in the Hadoop ecosystem (e.g., MapReduce, Hive, Pig)
4.Hadoop Distributed File System (HDFS)
Basics of HDFS
File Storage and Replication
8.Flask
Flask Introduction
Overview and purpose of Flask
Flask Application
Building a basic Flask application
Flask URL
Handling URLs in Flask
Templates
Using templates in Flask
Merging the ML Model
Integrating Flask with a Machine Learning model
Section 2
Amazon Web Services (AWS)
1.Cloud Computing
Definition and Characteristics
Cloud Service Models (IaaS, PaaS, SaaS)
Section 3:
Agile Scrum Methodology
1.Agile Introduction
Principles and Values of Agile
Agile Manifesto
2.Advantages of Agile
Benefits of Agile over traditional methodologies
3.Scrum Introduction
Framework Overview
Roles in Scrum: Scrum Master, Product Owner, Development Team
4.Scrum Process
Sprint Planning, Daily Standups, Sprint Review, Sprint Retrospective
5.Scrum Terminology
User Stories, Backlog, Burndown Charts, Sprint Backlog
Section 4:
Kafka
1.What is a Message Service
Introduction to message-oriented middleware
2.Kafka Introduction
Overview of Apache Kafka
Messaging System for Distributed Streaming
3.Kafka Architecture
Components: Producers, Brokers, Consumers
Topics and Partitions
ABOUT US
Data Minds Analytics stands as a premier training institute located in Hyderabad, India.
We take pride in being a leading platform dedicated to offering professional courses.
Our professional courses are taught by industry experts who are actively engaged in
real-time practice and who use the latest teaching tools and techniques. Our Learning
Management System (LMS) and dedicated support mentors are the key elements that make
learning easy and simple.
3000+ successfully trained students
1056+ facilitated career transitions
10+ courses across a diverse range of Industry 4.0 and Digital Transformation topics
Flexible training options, including Classroom, Online, E-Learning, and Corporate Training.