TS CV
Aspiring Full Stack Data Scientist with 2 years of data science experience and 5 years of overall experience. I have successfully delivered automation projects across various domains
using a wide range of Generative AI and ML algorithms. In recognition of my contributions, I was named Agile Employee of the Year - 2023 at Narola Infotech.
EDUCATION
Shivaji University, Kolhapur Aug 2012 - July 2016
Bachelor of Production Engineering
SKILLS
Programming: Python (Pandas, NumPy, spaCy, scikit-learn, NLTK, statsmodels, Matplotlib, Keras, TensorFlow, PyTorch, OpenCV, pydub),
Django REST framework, Flask, Git, GitHub, Plotly, Selenium Web Scraping.
Algorithms: Supervised & Unsupervised Machine Learning Algorithms, Deep Learning, BERT, Transformers, Neural Networks, CNN, Hugging Face, Whisper,
ChatGPT, Natural Language Processing, Generative AI.
Cloud Services: AWS EC2 Instance, S3 Bucket.
Statistical Stacks: Hypothesis Testing, Probability Theory, Time-Series Analysis, ANOVA, Data Mining.
Front End Stacks: HTML, CSS, jQuery, Bootstrap.
WORK EXPERIENCE
Data Scientist
Narola Infotech, Pune. November 2021-Present.
Kronos Chatbot
Project Overview: An AI-driven chatbot that responds to customer queries with human-like intelligence by mapping them to inventory data, reducing waiting time.
Project Achievements:
Data Augmentation: Augmented the client's data of 580 possible questions using T5ForConditionalGeneration from the Hugging Face
Transformers library, producing 3,480 augmented questions, and labeled the augmented data accordingly.
Model Development: Fine-tuned the DistilBERT model on the labeled augmented data to classify intent, reaching an accuracy of up to
91%. Serializing the model with PyTorch was a challenging task.
Named Entity Recognition (NER): Trained a custom NER model to extract attributes, achieving an F1 score of 88% for accurate outcomes.
Database Management: Integrated a MySQL database with the Django framework to store structured data and simplify data operations.
Regex String Pattern: Improved matching accuracy to up to 86% using custom regex to map inputs and collate them with the database, with exception handling.
Chatbot Design: Collaborated on implementing the frontend using HTML/CSS to improve customer engagement.
Cloud Deployment: Deployed the chatbot on an EC2 instance on AWS, which allowed for easy scaling and management of the application.
Impact: Improved customer relations and support services, reducing idle time by approximately 79%, which contributed to an increase in sales.
Disease Prediction
Project Overview: The objective of this project was to automate the process of diagnosing illnesses and identifying their treatments based on symptoms.
Project Achievements:
Data Engineering: Created an ETL pipeline using Selenium web scraping to prepare data from the PubMed forum, reducing manual effort by 30%.
Custom NER Model: Improved symptom detection performance by 13% by combining the pretrained Stanza clinical model with a custom NER
model and tuning it.
Symptom Analysis: Built a cosine similarity matrix to match predicted symptoms against the symptoms provided by the client, achieving
100% similarity matching, and segmented them by diagnosis class.
Gender Analysis: Further analyzed the proportion of genders affected by each diagnosis and post-processed the diagnosis by validating it with
the client catalog using various dataframe transformation techniques.
Data Presentation: Used a Python Excel writer to export all predicted results into a single Excel sheet, eliminating
60% of the time required by a physician.
API Testing: Used the Django framework to build the API and assisted the designer with further frontend improvements for the POC.
Impact: The automated AI diagnosis and treatment identification will improve patient outcomes by providing diagnoses and treatments based
on detected symptoms, reducing physicians’ time and effort in the initial phase by 40%.
Software Developer
Tata Consultancy Services. August 2021-November 2021.
Project Overview: A video analytics system that uses AI to detect BIB numbers and store a snapshot of the corresponding runner.
Project Achievements:
OpenCV EAST Text Detection OCR: Implemented the OpenCV EAST text detection network to detect participants' BIB numbers,
eliminating 75% of the manual effort needed to tag the respective runners.
Image Processing Techniques: Resized the images and converted them into a blob, then set a confidence threshold of 0.80 for detected regions.
Passed the results through non-maximum suppression to refine the outcome.
String Conversion: Converted the located text region into a string using pytesseract and stored the image with the respective BIB number in
the database, cutting the time to deliver a runner’s snapshot by BIB number by 80%.
Impact: The system, built with OpenCV EAST text detection and pytesseract to extract and store runners' BIB numbers,
increased efficiency by 75% and reduced human error by 80%.
Product Analyst & Designer
R.K. Polymer Industries Pvt Ltd, Pune. April 2018-July 2021.