Shreejith S

Data Analyst
9482310468 | [email protected] | Bengaluru, IN | Shreejith S | GitHub repository

SUMMARY
Detail-oriented Mechanical Engineer with 3.5+ years of experience transitioning into Data Analytics. Proficient in data analysis,
visualization, and problem-solving using tools like Python, Excel, and SQL. Adept at leveraging engineering skills to extract insights, drive
efficiency, and support data-driven decision-making in diverse environments.

KEY SKILLS
• Data Analysis • Problem Solving • Data Visualization
• Machine Learning • Statistical Analysis

TECHNICAL SKILLS
Tools/Languages: Python, MySQL, MS Excel, HyperWorks, ANSYS, Teamcenter VisMockup, AWS, Hadoop, Apache Spark (RDDs and Structured APIs)
CAE analysis solver: OptiStruct
Database: MySQL

PROFESSIONAL EXPERIENCE
• Virtual Build Validation Engineer
TCS – Bangalore
Nov '20 – Present
• Contribute to the virtual validation process for automotive build projects, ensuring accuracy and alignment with engineering
specifications and design intent.
• Collaborated with the design team to identify potential issues early in the product development cycle, reducing rework and
time-to-market.
• Utilized Teamcenter VisMockup for validating fit, form, and function before physical builds.
• Managed the integration of complex subsystems into the virtual vehicle model, ensuring seamless assembly and operation within
established tolerances.
• Mentored junior engineers, providing guidance on virtual build techniques and best practices, contributing to team development and
knowledge sharing.

INDUSTRY PROJECTS
Domain: Banking | Exploratory Data Analysis (EDA) on Loan Application Data | Tech Stack: Python (pandas, NumPy, Matplotlib, seaborn) | Apr '24
• Objective: Analyze a banking dataset to help the bank decide which applicants should be granted a loan.
• Solution: Performed exploratory data analysis (EDA) to identify data quality issues, understand the data structure, and draw initial insights.
○ Data Cleaning and Preprocessing: Loaded and assessed the dataset for structure, missing values, and duplicates. Dropped columns
with more than 40% missing data to streamline the analysis and ensure data reliability.
○ Data Exploration and Insight Generation: Explored data distributions, correlations, and unique values across features like loan
amount, applicant demographics, and credit scores. Visualized relationships between key variables (e.g., income vs. loan amount)
using scatter plots, histograms, and heatmaps to identify potential predictors.
○ Null Value Management: Quantified missing values across all columns and applied targeted removal to improve data quality.
○ Result: Provided a cleaned dataset with clear insights into applicant profiles and patterns in loan approvals, informing potential
modeling efforts or further business analysis.
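The column-dropping rule from the cleaning step above can be illustrated with a minimal sketch. It uses plain Python rather than the project's pandas code, and the function name `drop_sparse_columns`, the toy records, and the column names are hypothetical; only the 40% threshold comes from the project description.

```python
def drop_sparse_columns(rows, threshold=0.40):
    """Return copies of rows without columns whose missing share exceeds threshold.

    rows is a list of dicts sharing the same keys; None marks a missing value.
    """
    if not rows:
        return []
    columns = list(rows[0].keys())
    n = len(rows)
    # Fraction of missing values per column.
    missing_share = {
        col: sum(1 for row in rows if row.get(col) is None) / n
        for col in columns
    }
    kept = [col for col in columns if missing_share[col] <= threshold]
    return [{col: row.get(col) for col in kept} for row in rows]


# Hypothetical toy data: ext_score is missing in 3 of 5 rows (60%).
applications = [
    {"loan_amount": 1000, "income": 50, "ext_score": None},
    {"loan_amount": 2000, "income": None, "ext_score": None},
    {"loan_amount": 1500, "income": 70, "ext_score": 0.4},
    {"loan_amount": 1200, "income": 65, "ext_score": None},
    {"loan_amount": 900, "income": 40, "ext_score": 0.7},
]
cleaned = drop_sparse_columns(applications)
print(list(cleaned[0].keys()))
```

With pandas, the same rule could plausibly be written as `df.loc[:, df.isna().mean() <= 0.40]`, assuming `df` holds the loan data.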
Domain: Entertainment | SQL Data Exploration and Analysis on IMDb Database | Tech Stack: MySQL | Jun '24
• Objective: Advise a movie production house on the factors to consider when making movies for a global audience.
• Solution:
○ Data Profiling and Table Analysis: Queried row counts for all major tables (movie, genre, ratings, etc.) to understand data volume
and evaluate completeness.
Identified key dimensions and metrics, such as movie counts by genre and director, creating a structured view of the data available.
○ Null Value Detection and Data Quality Check: Used conditional aggregation to calculate the number of nulls across important
columns (e.g., title, release_date), providing insights into data gaps. Isolated incomplete records for further review and
determined impact on analytical accuracy.
○ Complex Query Execution: Extracted key insights from director_mapping and role_mapping tables to understand casting
distributions and director involvement. Joined movie and ratings tables to uncover trends in ratings across genres, highlighting
relationships that could inform content recommendations.
• Result: About 4,285 ‘Drama’ movies were released in the last 3 years, so RSVP Movies should choose ‘Drama’ for its global project,
with an average duration under 2 hours. Comedy or Thriller are alternative options, as both are among the top 3 genres.
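The conditional-aggregation null check described above can be sketched as follows. This is an illustration only: it runs against an in-memory SQLite database through Python's `sqlite3` module rather than the project's MySQL instance, the table layout and sample rows are assumed, and only the `movie` table name and the `title` and `release_date` columns come from the project description.

```python
import sqlite3

# In-memory stand-in for the movie table (assumed schema, toy rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT, release_date TEXT)"
)
conn.executemany(
    "INSERT INTO movie (title, release_date) VALUES (?, ?)",
    [("A", "2019-01-04"), ("B", None), (None, "2018-06-15"), ("D", None)],
)

# One pass over the table: each SUM counts the rows where that column is NULL.
query = """
SELECT
    SUM(CASE WHEN title IS NULL THEN 1 ELSE 0 END) AS title_nulls,
    SUM(CASE WHEN release_date IS NULL THEN 1 ELSE 0 END) AS release_date_nulls
FROM movie
"""
title_nulls, release_date_nulls = conn.execute(query).fetchone()
conn.close()
print(title_nulls, release_date_nulls)
```

The `SUM(CASE WHEN ... END)` pattern is standard SQL, so the query text should carry over to MySQL unchanged.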
Domain: Personal Rides | Tech Stack: Python | Jul '24
• Objective: Recommend strategies to improve the business of a bike-sharing company.
• Solution: Used linear regression to predict bike demand.
○ Data Understanding and Initial Exploration: Loaded and explored a dataset of daily bike rentals, examining the structure,
distributions, and relationships among features (e.g., season, weather conditions, holiday, and working day indicators).
Used descriptive statistics and .info() checks to confirm data completeness and identify potential null values or outliers for
preprocessing.
○ Data Preprocessing and Feature Engineering: Converted categorical variables, such as season and weather, into meaningful labels
based on a data dictionary, making the data more interpretable and model-ready. Created additional engineered features, such as
interaction terms between temperature and season, to capture underlying patterns and improve model performance.
Standardized and normalized numerical features to facilitate smoother model training and to address variance issues.
○ Exploratory Data Analysis (EDA): Visualized relationships between features and rental counts using histograms, scatter plots, and
box plots to identify influential variables and potential non-linear relationships. Calculated correlation matrices to assess
multicollinearity and inform feature selection for modeling, focusing on high-impact features like temperature, humidity, and
seasonality.
○ Model Development and Evaluation: Built and evaluated multiple regression models, including Ordinary Least Squares (OLS) Linear
Regression, and applied variance inflation factors (VIF) to minimize multicollinearity issues.
Assessed models using metrics such as Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R², refining the model
through hyperparameter tuning and regularization.
Iterated model versions to select the best-performing model, achieving a reliable prediction accuracy that aligns with real-world
usage patterns.
• Key achievement: Achieved an R-squared value of 84.2% and an adjusted R-squared value of 83.8%.

Domain: Education Firm | Tech Stack: Python | Aug '24
• Objective: Improve the business of an education platform by identifying hot leads.
• Solution: Used logistic regression to estimate each lead's probability of conversion and calculate a lead score.
○ Data Preparation: Imported and cleaned the dataset (Leads.csv), which involved handling missing values, analyzing feature
distributions, and transforming categorical features.
○ Exploratory Data Analysis (EDA): Analyzed the variables that significantly affect lead conversion.
Visualized data using Matplotlib and seaborn to gain insights into key conversion factors (such as lead source and lead activity).
Identified strong indicators of lead conversion based on trends in the dataset.
○ Model Building and Evaluation: Developed a logistic regression model to assign lead scores between 0 and 100. Employed Recursive
Feature Elimination (RFE) to select the most relevant features for the model. Split the data into training and testing sets and
evaluated the model using precision-recall curves, accuracy, precision, and recall.
• Key achievement: Obtained an accuracy of 91%, sensitivity of 89%, and specificity of 93%.
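For reference, the metrics reported above and the 0 to 100 lead score can be computed from predicted conversion probabilities as in the sketch below. The helper names, the 0.5 cutoff, and the toy labels and probabilities are all assumptions for illustration; they are not taken from the project.

```python
def classification_rates(y_true, y_pred):
    """Accuracy, sensitivity (recall on positives) and specificity (recall on negatives)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def lead_score(probability):
    """Map a conversion probability in [0, 1] to an integer score in [0, 100]."""
    return round(probability * 100)


# Hypothetical labels (1 = converted) and model probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
probabilities = [0.9, 0.8, 0.3, 0.2, 0.6, 0.1, 0.4, 0.7]

cutoff = 0.5  # assumed decision threshold
y_pred = [1 if p >= cutoff else 0 for p in probabilities]

rates = classification_rates(y_true, y_pred)
scores = [lead_score(p) for p in probabilities]
print(rates, scores)
```

Lowering the cutoff generally raises sensitivity at the cost of specificity, so the choice of threshold depends on how costly a missed hot lead is relative to a wasted follow-up.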
Domain: Automotive | Tech Stack: Tableau | Sep '24
• Objective: Compute KPIs such as average electric range, total BEVs, and total PHEVs, and identify the market leader in EVs.
• Solution: Used Tableau to calculate the average electric range and the distribution of BEVs across US states.
• Key achievement: Determined that Tesla is the EV market leader, accounting for almost 63% of all EVs in the market.

EDUCATION
Advanced certification program in Data Science Feb '24 - Oct '24
IIIT Bangalore & upGrad Bengaluru, IN
• Course Modules:
○ Data Analysis using SQL | Introduction to Python | Introduction to Machine Learning and Linear Regression
○ Telecom Churn Case Study | Lexical Processing | Syntactic Processing
○ Basics of AWS | Apache Spark | Hadoop

Bachelor of Engineering in Mechanical Engineering Jun '16 - Aug '20
National Institute of Engineering Mysore, Karnataka, IN
• Secured 8.7 CGPA

CERTIFICATIONS
• Data Science Programming Bootcamp: Python and SQL
• Earned the SQL 50 badge on LeetCode

ADDITIONAL INFORMATION
• Languages: English, Kannada, Marathi, Tulu