Instructor: Renato Rocha Souza
This is the repository of code for the "Introduction to Data Science" class.
This class is about the Data Science process, in which we seek to gain useful predictions and insights from data. Through real-world examples and code snippets, we introduce methods for:
- data munging, scraping, sampling and cleaning in order to get an informative, manageable data set;
- data storage and management in order to be able to access data, even at big data scale;
- exploratory data analysis (EDA) to generate hypotheses and intuition about the data;
- prediction based on statistical learning tools;
- communication of results through visualization, stories, and interpretable summaries.
Detailed Syllabus:
- Supervised
Bayesian Models
Neural Network concepts
- General Math
- Linear and dense layers
Data Science Tasks
Preparing the Environment
Versioning Tools
Exploratory Data Analysis Tools
Machine Learning Tools
NLP Tools
Graph Analysis Tools
Dashboards and UIs
Neural Networks visualization
Relational databases and SQL
NoSQL / Graph Databases
Data Wrangling and Distributed computing
Analytical Pipelines
We are using because the /datasets files can be large. Install it before the git clone.