Data Science ML Full Stack 2022
hemansnation
1. Python Programming and Logic Building
2. Data Structures & Algorithms
3. Pandas, NumPy, Matplotlib
4. Statistics
5. Machine Learning
6. Natural Language Processing
7. Computer Vision
8. Data Visualization with Tableau
9. Structured Query Language (SQL)
10. Big Data and PySpark
11. Development Operations with Azure
12. Five Major Projects and Git
Technology Stack
Python
Data Structures
NumPy
Pandas
Matplotlib
Seaborn
Scikit-Learn
Statsmodels
Natural Language Toolkit (NLTK)
PyTorch
OpenCV
Tableau
Structured Query Language (SQL)
PySpark
Azure Fundamentals
Azure Data Factory
Databricks
5 Major Projects
Git and GitHub
1 | Python Programming and Logic Building
I prefer the Python programming language. Python is one of the best languages for
starting your programming journey. Here is the Python roadmap for logic building.
3 | Pandas, NumPy, Matplotlib
NumPy
Vectors and Matrices
Operations on Matrices
Mean, Variance, and Standard Deviation
Reshaping Arrays
Transpose and Determinant of a Matrix
Diagonal Operations, Trace
Add, Subtract, Multiply, Dot, and Cross Product.
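The NumPy topics above can be tried out in a few lines. This is a minimal sketch, assuming NumPy is installed; the matrix `A` and the vectors `u` and `v` are made-up values chosen only to keep the arithmetic easy to check by hand.

```python
import numpy as np

# Illustrative inputs (not from the roadmap): a 2x2 matrix and two 3-vectors.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
u = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 1.0, 0.0])

# Mean, variance, and standard deviation over all elements.
# NumPy computes the population variance (ddof=0) by default.
mean, var, std = A.mean(), A.var(), A.std()

# Reshape six values into 2 rows x 3 columns, then transpose to 3 x 2.
grid = np.arange(6).reshape(2, 3)
grid_t = grid.T

# Determinant, main diagonal, and trace (sum of the diagonal) of A.
det = np.linalg.det(A)
diag = np.diag(A)
trace = np.trace(A)

# Element-wise arithmetic versus the matrix product.
added = A + A
elementwise = A * A
matmul = A @ A

# Dot and cross product of the two vectors.
dot_uv = np.dot(u, v)
cross_uv = np.cross(u, v)
```

Note the difference between `A * A`, which multiplies element by element, and `A @ A`, which is the matrix product.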
Pandas
Series and DataFrames
Slicing, Rows, and Columns
Operations on DataFrame
Different ways to create DataFrame
Read, Write Operations with CSV files
Handling Missing Values, Replacing Values, and Regular Expressions
GroupBy and Concatenation
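The Pandas topics above fit into one short walk-through. This is a sketch under stated assumptions: pandas is installed, and the `city`, `code`, and `units` columns are hypothetical data invented for the example. An in-memory buffer stands in for a CSV file so the snippet does not touch the disk.

```python
import io

import pandas as pd

# Hypothetical order records; NaN marks a missing value.
df = pd.DataFrame({
    "city": ["Pune", "Delhi", "Pune", "Delhi"],
    "code": ["A-1", "B-2", "A-3", "B-4"],
    "units": [10.0, float("nan"), 30.0, 20.0],
})

# A single column is a Series; iloc slices rows by position.
units = df["units"]
first_two = df.iloc[:2]

# Handle the missing value by filling it with the column mean (NaN is skipped).
filled = df.fillna({"units": units.mean()})

# Regular expression: drop every non-digit character from the codes.
filled["code"] = filled["code"].str.replace(r"\D", "", regex=True)

# Write to CSV and read it back; the buffer plays the role of a file.
buffer = io.StringIO()
filled.to_csv(buffer, index=False)
buffer.seek(0)
restored = pd.read_csv(buffer)

# GroupBy: total units per city.
totals = filled.groupby("city")["units"].sum()

# Concatenation: append one more row from a second DataFrame.
extra = pd.DataFrame({"city": ["Mumbai"], "code": ["5"], "units": [15.0]})
combined = pd.concat([filled, extra], ignore_index=True)
```

With a real file, `filled.to_csv("orders.csv", index=False)` and `pd.read_csv("orders.csv")` do the same round trip; the file name here is only a placeholder.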
Matplotlib
Graph Basics
Format Strings in Plots
Label Parameters, Legend
Bar Chart, Pie Chart, Histogram, Scatter Plot
4 | Statistics
Descriptive Statistics
Measures of Frequency and Central Tendency
Measures of Dispersion
Probability Distribution
Gaussian (Normal) Distribution
Skewness and Kurtosis
Regression Analysis
Continuous and Discrete Functions
Goodness of Fit
Normality Test
ANOVA
Homoscedasticity
Linear and Non-Linear Relationship with Regression
Inferential Statistics
t-Test
z-Test
Hypothesis Testing
Type I and Type II errors
t-Test and its types
One way ANOVA
Two way ANOVA
Chi-Square Test
Implementation of continuous and categorical data
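A few of the statistics topics above can be explored with Python's standard library alone. The sketch below computes descriptive statistics and a two-sided one-sample z-test. The sample values, the hypothesized mean `mu0`, the "known" population standard deviation `sigma`, and the 0.05 significance level are all assumptions made for illustration.

```python
import math
import statistics as st

# Made-up sample, chosen so the results are easy to verify by hand.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(sample)

# Descriptive statistics: central tendency and dispersion.
mean = st.mean(sample)
median = st.median(sample)
mode = st.mode(sample)
pop_std = st.pstdev(sample)      # population standard deviation
sample_var = st.variance(sample)  # sample variance (divides by n - 1)

# One-sample z-test. H0: the population mean equals mu0.
mu0 = 4.0    # hypothesized mean (assumption)
sigma = 2.0  # population standard deviation, assumed known
z = (mean - mu0) / (sigma / math.sqrt(n))

# Two-sided p-value from the standard normal distribution.
p_value = 2 * (1 - st.NormalDist().cdf(abs(z)))

# Reject H0 only if the p-value falls below the significance level.
alpha = 0.05
reject_h0 = p_value < alpha
```

Rejecting a true null hypothesis would be a Type I error, and `alpha` is the probability of that error we agree to tolerate. When the population standard deviation is unknown, a t-test is used in place of the z-test.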
5 | Machine Learning
One of the best ways to master machine learning algorithms is to work with the
Scikit-Learn library. Scikit-Learn ships ready-made implementations of these
algorithms, and you use one by creating an object of its class. These are the
algorithms you must know, covering both Supervised and Unsupervised Machine Learning:
Linear Regression
Logistic Regression
Decision Tree
Gradient Descent
Random Forest
Ridge and Lasso Regression
Naive Bayes
Support Vector Machine
KMeans Clustering
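The "create an object of the class" workflow looks the same for most Scikit-Learn estimators: construct the object, call `fit`, then call `predict`. The following is a minimal sketch, assuming scikit-learn is installed; the toy datasets are invented so the expected behaviour is obvious, and three of the listed algorithms stand in for the rest.

```python
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression, LogisticRegression

# Supervised regression on toy data that follows y = 2x + 1 exactly.
X_reg = [[0.0], [1.0], [2.0], [3.0]]
y_reg = [1.0, 3.0, 5.0, 7.0]
reg = LinearRegression()
reg.fit(X_reg, y_reg)
predicted_value = reg.predict([[4.0]])[0]

# Supervised classification: small values are class 0, large values class 1.
X_cls = [[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]]
y_cls = [0, 0, 0, 1, 1, 1]
clf = LogisticRegression()
clf.fit(X_cls, y_cls)
predicted_classes = clf.predict([[1.5], [10.5]])

# Unsupervised clustering: no labels are passed, only the points.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X_cls)
```

The cluster ids returned by KMeans are arbitrary labels, so only the grouping is meaningful, not which group is called 0 or 1.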
6 | Natural Language Processing
Sentiment Analysis
POS Tagging, Parsing
Text preprocessing
Stemming and Lemmatization
Sentiment classification using Naive Bayes
TF-IDF, N-grams
Machine Translation, BLEU Score
Text Generation, Summarization, ROUGE Score
Language Modeling, Perplexity
Building a text classifier
Identifying the gender
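Text preprocessing, N-grams, and TF-IDF from the list above can be sketched in plain Python before reaching for a library. The three short documents are made up, and the weighting used here is the plain `tf * log(N / df)` variant; libraries such as Scikit-Learn apply smoothing and normalization, so their numbers will differ.

```python
import math
import re
from collections import Counter

# Tiny illustrative corpus (invented for this example).
docs = [
    "The movie was great",
    "The movie was boring",
    "Great acting and a great plot",
]


def tokenize(text):
    """Lowercase the text and keep only alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(corpus):
    """Score each term per document as term frequency times log(N / df)."""
    tokenized = [tokenize(doc) for doc in corpus]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


bigrams = ngrams(tokenize(docs[0]), 2)
scores = tf_idf(docs)
# The highest-scoring term in the second document.
top_term = max(scores[1], key=scores[1].get)
```

Terms that occur in many documents, such as "the", get a low weight, while a term unique to one document, such as "boring", gets the highest weight in that document.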
7 | Computer Vision
Image and video analytics rely on computer vision, and working in computer vision
starts with understanding how images are represented.
PyTorch Tensors
Understanding pretrained models like AlexNet and ResNet, which are trained on the ImageNet dataset.
Neural Networks
Building a perceptron
Building a single layer neural network
Building a deep neural network
Recurrent neural network for sequential data analysis
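Before moving to PyTorch, it helps to build a perceptron by hand. The sketch below is a plain-Python version of the classic perceptron learning rule, trained on the logical AND function. The AND task, the learning rate, and the epoch count are assumptions picked for illustration, and the function names are hypothetical.

```python
def train_perceptron(samples, targets, learning_rate=1.0, epochs=20):
    """Fit weights and a bias with the perceptron learning rule."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, target in zip(samples, targets):
            prediction = predict(weights, bias, features)
            # The error is 0 when correct, +1 or -1 when wrong.
            error = target - prediction
            weights = [
                w + learning_rate * error * x
                for w, x in zip(weights, features)
            ]
            bias += learning_rate * error
    return weights, bias


def predict(weights, bias, features):
    """Step activation: output 1 if the weighted sum is positive, else 0."""
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if activation > 0 else 0


# Truth table of logical AND, which is linearly separable.
inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [0, 0, 0, 1]

weights, bias = train_perceptron(inputs, outputs)
learned = [predict(weights, bias, x) for x in inputs]
```

A single perceptron can only learn linearly separable functions such as AND. A function like XOR needs at least one hidden layer, which is the motivation for the single layer and deep networks in the list above.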
10 | Big Data and PySpark
PySpark
Resilient Distributed Datasets
Schema
Lambda Expressions
Transformations
Actions
Data Modeling
Duplicate Data
Descriptive Analysis on Data
Visualizations
MLlib
ML Packages
Pipelines
Streaming
Packaging Spark Applications
We follow project-based learning and we will work on all the projects in parallel.
Join the Data Science & ML Full Stack WhatsApp Group here:
https://bit.ly/3qxKEFP
LinkedIn: https://www.linkedin.com/in/hemansnation/
Twitter: https://twitter.com/hemansnation
GitHub: https://github.com/hemansnation
Instagram: https://www.instagram.com/masterdexter.ai/