Tech Assessment For Data Scientists - Analyst
Objective
Analyze the "Global Weather Repository.csv" dataset to forecast future weather trends and
showcase data science skills through both basic and advanced techniques. The dataset provides
daily weather information for cities around the world and includes over 40 features that
describe weather conditions.
Note: You can complete either the basic or the advanced assessment. Advanced analyses can
demonstrate a higher level of skill, but fulfilling either set of requirements is
acceptable.
Dataset
● The dataset is available on the Kaggle website.
World Weather Repository:
https://www.kaggle.com/datasets/nelgiriyewithana/global-weather-repository/code
Assessment Details
Basic Assessment
Data Cleaning & Preprocessing
● Handle missing values, treat outliers, and normalize the data.
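One possible shape for this step is sketched below, as an illustration rather than a required approach. It assumes median imputation, clipping at the 1.5 × IQR fences, and min-max scaling; other choices are equally valid. The helper `clean_numeric` and the column names `temperature_celsius` and `humidity` are assumptions for illustration and should be checked against the actual CSV header.

```python
import numpy as np
import pandas as pd


def clean_numeric(df: pd.DataFrame, columns: list[str]) -> pd.DataFrame:
    """Impute, clip outliers, and min-max scale the given numeric columns."""
    out = df.copy()
    for col in columns:
        series = out[col].astype(float)
        # Fill missing values with the column median (robust to outliers).
        series = series.fillna(series.median())
        # Clip values outside the 1.5 * IQR fences instead of dropping rows.
        q1, q3 = series.quantile(0.25), series.quantile(0.75)
        iqr = q3 - q1
        series = series.clip(lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr)
        # Min-max normalize to [0, 1]; a constant column maps to 0.
        span = series.max() - series.min()
        out[col] = (series - series.min()) / span if span > 0 else 0.0
    return out


# Tiny made-up frame, only to show the call; these are not real dataset values.
demo = pd.DataFrame({
    "temperature_celsius": [21.0, 22.5, np.nan, 23.0, 95.0, 20.5],
    "humidity": [40, 42, 45, np.nan, 41, 43],
})
cleaned = clean_numeric(demo, ["temperature_celsius", "humidity"])
print(cleaned.round(3))
```

Clipping keeps every row, which matters for time series work later on; dropping rows would leave gaps in the timeline.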
Exploratory Data Analysis (EDA)
● Perform basic EDA to uncover trends, correlations, and patterns.
● Generate visualizations for temperature and precipitation.
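As a rough sketch of the EDA step, the snippet below computes a correlation matrix and monthly aggregates with pandas. It runs on synthetic data so that it is self-contained. The column names `last_updated`, `temperature_celsius`, `humidity`, and `precip_mm` are assumptions and may differ from the real file.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the CSV so the sketch runs on its own.
rng = np.random.default_rng(0)
dates = pd.date_range("2024-01-01", periods=120, freq="D")
temp = 15 + 10 * np.sin(np.arange(120) * 2 * np.pi / 365) + rng.normal(0, 1, 120)
df = pd.DataFrame({
    "last_updated": dates,
    "temperature_celsius": temp,
    "humidity": 80 - 1.5 * temp + rng.normal(0, 2, 120),
    "precip_mm": rng.gamma(shape=0.5, scale=4.0, size=120),
})

# Pairwise Pearson correlations between the numeric features.
corr = df.select_dtypes("number").corr()

# Monthly aggregates: mean temperature and total precipitation.
monthly = (
    df.set_index("last_updated")
      .resample("MS")
      .agg({"temperature_celsius": "mean", "precip_mm": "sum"})
)
print(corr.round(2))
print(monthly.round(2))
```

If matplotlib is installed, `monthly.plot(subplots=True)` is one way to turn the aggregated table into the temperature and precipitation charts.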
Model Building
● Build a basic forecasting model and evaluate its performance using different metrics.
● Use the lastupdated feature for the time series analysis.
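A minimal baseline for this step might look like the following. It assumes the timestamp feature appears in the CSV as a string column named `last_updated` (an assumption to verify), parses it into a datetime index, and holds out the last 14 days. The 7-day mean forecast and the helper `evaluate` are illustrative choices, not a prescribed model.

```python
import numpy as np
import pandas as pd


def evaluate(actual: np.ndarray, predicted: np.ndarray) -> dict[str, float]:
    """Return MAE, RMSE, and MAPE (in percent) for one forecast."""
    err = actual - predicted
    return {
        "mae": float(np.mean(np.abs(err))),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mape": float(np.mean(np.abs(err / actual)) * 100),
    }


# Synthetic daily series; the timestamp starts as text, as it would in a CSV.
rng = np.random.default_rng(1)
raw = pd.DataFrame({
    "last_updated": pd.date_range("2024-01-01", periods=100, freq="D").astype(str),
    "temperature_celsius": 20 + 0.05 * np.arange(100) + rng.normal(0, 0.5, 100),
})
raw["last_updated"] = pd.to_datetime(raw["last_updated"])
series = raw.set_index("last_updated")["temperature_celsius"].sort_index()

# Chronological split: a time series should not be shuffled.
train, test = series.iloc[:-14], series.iloc[-14:]

# Baseline: repeat the mean of the last 7 training days over the horizon.
forecast = np.full(len(test), train.iloc[-7:].mean())
scores = evaluate(test.to_numpy(), forecast)
print(scores)
```

MAPE is undefined when an actual value is zero, so it is a poor fit for temperatures near 0 °C or for precipitation; MAE and RMSE are safer defaults there.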
Advanced Assessment
Advanced EDA
● Implement anomaly detection to identify and analyze outliers.
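One simple way to approach anomaly detection is a modified z-score built on the median and the median absolute deviation (MAD). The sketch below is an assumption about how this could be done, not the expected method. The function `flag_anomalies`, the threshold of 3.5, and the `wind_kph` series are illustrative.

```python
import numpy as np
import pandas as pd


def flag_anomalies(values: pd.Series, threshold: float = 3.5) -> pd.Series:
    """Flag points whose modified z-score exceeds the threshold."""
    median = values.median()
    mad = (values - median).abs().median()
    if mad == 0:
        # No spread around the median: nothing can be scored.
        return pd.Series(False, index=values.index)
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    modified_z = 0.6745 * (values - median) / mad
    return modified_z.abs() > threshold


# Synthetic wind speeds with two injected bad readings.
rng = np.random.default_rng(2)
wind = pd.Series(rng.normal(12, 2, 200), name="wind_kph")
wind.iloc[[25, 140]] = [60.0, -30.0]

mask = flag_anomalies(wind)
print(wind[mask])
```

Because the median and MAD are barely affected by the extreme values themselves, this score is less likely to hide outliers than a mean and standard deviation would be.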
Forecasting with Multiple Models
● Build and compare multiple forecasting models.
● Create an ensemble of models to improve forecast accuracy.
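The comparison and ensemble steps could be sketched as follows. It uses three deliberately simple forecasters and an equal-weight average; all function names and the synthetic series are illustrative assumptions. An averaged forecast is not guaranteed to beat the best single model, although its MAE can never exceed that of the worst member.

```python
import numpy as np


def naive(train: np.ndarray, horizon: int) -> np.ndarray:
    """Repeat the last observed value."""
    return np.full(horizon, train[-1])


def moving_average(train: np.ndarray, horizon: int, window: int = 7) -> np.ndarray:
    """Repeat the mean of the last `window` observations."""
    return np.full(horizon, train[-window:].mean())


def linear_trend(train: np.ndarray, horizon: int) -> np.ndarray:
    """Extrapolate a straight line fitted to the training data."""
    t = np.arange(len(train))
    slope, intercept = np.polyfit(t, train, deg=1)
    return intercept + slope * np.arange(len(train), len(train) + horizon)


def mae(actual: np.ndarray, predicted: np.ndarray) -> float:
    return float(np.mean(np.abs(actual - predicted)))


# Synthetic series: trend plus a 30-day cycle plus noise.
rng = np.random.default_rng(3)
t = np.arange(120)
series = 18 + 0.03 * t + 2 * np.sin(2 * np.pi * t / 30) + rng.normal(0, 0.5, 120)
train, test = series[:-10], series[-10:]

forecasts = {
    "naive": naive(train, len(test)),
    "moving_average": moving_average(train, len(test)),
    "linear_trend": linear_trend(train, len(test)),
}
# Equal-weight ensemble: element-wise mean of the three member forecasts.
forecasts["ensemble"] = np.mean(list(forecasts.values()), axis=0)

results = {name: mae(test, pred) for name, pred in forecasts.items()}
for name, score in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{name:15s} MAE = {score:.3f}")
```

Weights tuned on a separate validation window are a natural next step once stronger member models are in place.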
Unique Analyses
● Climate Analysis: Study long-term climate patterns and variations in different
regions.
● Environmental Impact: Analyze air quality and its correlation with various weather
parameters.
● Feature Importance: Apply different techniques to assess feature importance.
● Spatial Analysis: Analyze and visualize geographical patterns in the data.
● Geographical Patterns: Explore how weather conditions differ across countries and
continents.
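For the feature importance item, permutation importance is one technique that works with any fitted model. The sketch below applies it to an ordinary least squares fit on synthetic data. The feature names and the synthetic air-quality-style target are assumptions for illustration only. For brevity it scores on the training data; a held-out set would be preferable in practice.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500
features = {
    "temperature_celsius": rng.normal(20, 5, n),
    "humidity": rng.normal(60, 10, n),
    "wind_kph": rng.normal(12, 4, n),
}
X = np.column_stack(list(features.values()))
# Synthetic target: strong wind effect, weak humidity effect, no temperature effect.
y = 50 - 2.0 * X[:, 2] + 0.2 * X[:, 1] + rng.normal(0, 1, n)

# Fit ordinary least squares with an intercept column.
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)


def mse(X_eval: np.ndarray) -> float:
    pred = np.column_stack([np.ones(len(X_eval)), X_eval]) @ coef
    return float(np.mean((y - pred) ** 2))


baseline = mse(X)
importance = {}
for j, name in enumerate(features):
    increases = []
    for _ in range(10):
        # Shuffle one column to break its link with the target.
        X_perm = X.copy()
        X_perm[:, j] = rng.permutation(X_perm[:, j])
        increases.append(mse(X_perm) - baseline)
    # Importance = average increase in error after shuffling.
    importance[name] = float(np.mean(increases))

for name, score in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name:22s} {score:8.3f}")
```

Comparing this ranking with another method, such as standardized regression coefficients or tree-based importances, is one way to meet the "different techniques" wording of the task.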
Deliverable:
● Display the PM Accelerator mission on the report/presentation/dashboard. You can
find it on the website.
● Create a simple report or presentation that includes all analyses, model evaluations,
and visualizations.
● Explain the data cleaning, EDA, forecasting models, advanced analyses, and insights
in a well-organized format.
● Submit the report or presentation in a GitHub repository or project folder. Include a
detailed `README.md` or equivalent documentation explaining the project,
methodology, and results.
● Share the link to the repository or project folder by the submission deadline.